(57) The present invention relates to a method for
evolving a polynucleotide encoding a plurality of genes,
e.g. multiple genes forming a multicomponent pathway.
The method involves shuffling of polynucleotides by
conducting a polynucleotide amplification process on
overlapping segments of a population of variants of a
polynucleotide encoding a plurality of genes under
conditions whereby one segment serves as a template for
extension of another segment to generate a population
of recombinant polynucleotides. This population is
screened for a recombinant polynucleotide encoding a
plurality of genes having a desired property.
Description

Field of the Invention 

[0001] The present invention relates to a method for the production of polynucleotides conferring a desired phenotype
and/or encoding a protein having an advantageous predetermined property which is selectable or can be screened
for. In an aspect, the method is used for generating and selecting or screening for desired nucleic acid fragments
encoding mutant proteins.

BACKGROUND AND DESCRIPTION OF RELATED ART

[0002] The complexity of an active sequence of a biological macromolecule, e.g. proteins, DNA etc., has been called
its information content ("IC"; 5-9). The information content of a protein has been defined as the resistance of the active
protein to amino acid sequence variation, calculated from the minimum number of invariable amino acids (bits) required
to describe a family of related sequences with the same function (9, 10). Proteins that are sensitive to random
mutagenesis have a high information content. In 1974, when this definition was coined, protein diversity existed only as
taxonomic diversity.

[0003] Molecular biology developments such as molecular libraries have allowed the identification of a much larger
number of variable bases, and even the selection of functional sequences from random libraries. Most residues can be
varied, although typically not all at the same time, depending on compensating changes in the context. Thus a 100 amino
acid protein can contain only 2,000 different mutations, but 20^100 possible combinations of mutations.
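
The two figures quoted for a 100 amino acid protein can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it follows the text's round figure of 100 positions times 20 residues (counting only substitutions that differ from the starting residue would give 100 x 19 = 1,900).

```python
# Minimal sketch of the counting argument; names are illustrative assumptions.
PROTEIN_LENGTH = 100   # residues, as in the text
ALPHABET_SIZE = 20     # standard amino acids


def single_site_variants(length: int, alphabet: int) -> int:
    """Number of (position, residue) choices: the text's '2,000 mutations'."""
    return length * alphabet


def full_sequence_space(length: int, alphabet: int) -> int:
    """Number of distinct sequences of the given length (alphabet ** length)."""
    return alphabet ** length


if __name__ == "__main__":
    print(single_site_variants(PROTEIN_LENGTH, ALPHABET_SIZE))
    # Python integers are arbitrary precision, so 20**100 is computed exactly.
    space = full_sequence_space(PROTEIN_LENGTH, ALPHABET_SIZE)
    print(f"{space:.3e}")
```

The gap between the two numbers is the point of the paragraph: single mutants can be enumerated, whereas the combinatorial space cannot be searched exhaustively.
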
[0004] Information density is the Information Content/unit length of a sequence. Active sites of enzymes tend to have
a high information density. By contrast, flexible linkers in enzymes have a low information density (8).
[0005] Current methods in widespread use for creating mutant proteins in a library format are error-prone polymerase
chain reaction (11, 12, 19) and cassette mutagenesis (8, 20, 21, 22, 40, 41, 42), in which the specific region to be
optimized is replaced with a synthetically mutagenized oligonucleotide. Alternatively, mutator strains of host cells have
been employed to add mutational frequency (Greener and Callahan (1995) Strategies in Mol. Biol. 7: 32). In each case,
a 'mutant cloud' (4) is generated around certain sites in the original sequence.

[0006] Error-prone PCR uses low-fidelity polymerization conditions to introduce a low level of point mutations
randomly over a long sequence. Error-prone PCR can be used to mutagenize a mixture of fragments of unknown sequence.
However, computer simulations have suggested that point mutagenesis alone may often be too gradual to allow the
block changes that are required for continued sequence evolution. The published error-prone PCR protocols are
generally unsuited for reliable amplification of DNA fragments greater than 0.5 to 1.0 kb, limiting their practical application.
Further, repeated cycles of error-prone PCR lead to an accumulation of neutral mutations, which, for example, may
make a protein immunogenic.

[0007] In oligonucleotide-directed mutagenesis, a short sequence is replaced with a synthetically mutagenized
oligonucleotide. This approach does not generate combinations of distant mutations and is thus not significantly
combinatorial. The limited library size relative to the vast sequence length means that many rounds of selection are
unavoidable for protein optimization. Mutagenesis with synthetic oligonucleotides requires sequencing of individual clones
after each selection round followed by grouping into families, arbitrarily choosing a single family and reducing it to a
consensus motif, which is resynthesized and reinserted into a single gene followed by additional selection. This process
constitutes a statistical bottleneck; it is labor intensive and not practical for many rounds of mutagenesis.
[0008] Error-prone PCR and oligonucleotide-directed mutagenesis are thus useful for single cycles of sequence fine
tuning but rapidly become limiting when applied for multiple cycles.

[0009] Error-prone PCR can be used to mutagenize a mixture of fragments of unknown sequence (11, 12). However,
the published error-prone PCR protocols (11, 12) suffer from a low processivity of the polymerase. Therefore, the
protocol is very difficult to employ for the random mutagenesis of an average-sized gene. This inability limits the practical
application of error-prone PCR.

[0010] Another serious limitation of error-prone PCR is that the rate of down-mutations grows with the information
content of the sequence. At a certain information content, library size, and mutagenesis rate, the balance of
down-mutations to up-mutations will statistically prevent the selection of further improvements (statistical ceiling).
[0011] Finally, repeated cycles of error-prone PCR will also lead to the accumulation of neutral mutations, which can
affect, for example, immunogenicity but not binding affinity.

[0012] Thus error-prone PCR was found to be too gradual to allow the block changes that are required for continued
sequence evolution (1, 2).

[0013] In cassette mutagenesis, a sequence block of a single template is typically replaced by a (partially) randomized
sequence. Therefore, the maximum information content that can be obtained is statistically limited by the number of
random sequences (i.e., library size). This constitutes a statistical bottleneck, eliminating other sequence families which
are not currently best, but which may have greater long term potential.

[0014] Further, mutagenesis with synthetic oligonucleotides requires sequencing of individual clones after each
selection round (20). Therefore, this approach is tedious and is not practical for many rounds of mutagenesis.
[0015] Error-prone PCR and cassette mutagenesis are thus best suited and have been widely used for fine-tuning
areas of comparatively low information content. An example is the selection of an RNA ligase ribozyme from a random
library using many rounds of amplification by error-prone PCR and selection (13).

[0016] It is becoming increasingly clear that our scientific tools for the design of recombinant linear biological sequences
such as protein, RNA and DNA are not suitable for generating the necessary sequence diversity needed to optimize
many desired properties of a macromolecule or organism. Finding better and better mutants depends on searching
more and more sequences within larger and larger libraries, and increasing numbers of cycles of mutagenic
amplification and selection are necessary. However, as discussed above, the existing mutagenesis methods that are in
widespread use have distinct limitations when used for repeated cycles.

[0017] Evolution of most organisms occurs by natural selection and sexual reproduction. Sexual reproduction
ensures mixing and combining of the genes of the offspring of the selected individuals. During meiosis, homologous
chromosomes from the parents line up with one another and cross-over part way along their length, thus swapping
genetic material. Such swapping or shuffling of the DNA allows organisms to evolve more rapidly (1, 2). In sexual
recombination, because the inserted sequences were of proven utility in a homologous environment, the inserted
sequences are likely to still have substantial information content once they are inserted into the new sequence.
[0018] Marton et al. (27) describe the use of PCR in vitro to monitor recombination in a plasmid having directly
repeated sequences. Marton et al. disclose that recombination will occur during PCR as a result of breaking or nicking
of the DNA. This will give rise to recombinant molecules. Meyerhans et al. (23) also disclose the existence of DNA
recombination during in vitro PCR.

[0019] The term Applied Molecular Evolution ("AME") means the application of an evolutionary design algorithm to
a specific, useful goal. While many different library formats for AME have been reported for polynucleotides (3, 11-14),
peptides and proteins (phage (15-17), lacI (18) and polysomes), in none of these formats has recombination by random
cross-overs been used to deliberately create a combinatorial library.

[0020] Theoretically there are 2,000 different single mutants of a 100 amino acid protein. A protein of 100 amino
acids has 20^100 possible combinations of mutations, a number which is too large to exhaustively explore by conventional
methods. It would be advantageous to develop a system which would allow the generation and screening of all of these
possible combination mutations.

[0021] Winter and coworkers (43, 44) have utilized an in vivo site specific recombination system to combine light
chain antibody genes with heavy chain antibody genes for expression in a phage system. However, their system relies
on specific sites of recombination and thus is limited. Hayashi et al. (48) report simultaneous mutagenesis of antibody
CDR regions in single chain antibodies (scFv) by overlap extension and PCR.

[0022] Caren et al. (45) describe a method for generating a large population of multiple mutants using random in
vivo recombination. However, their method requires the recombination of two different libraries of plasmids, each library
having a different selectable marker. Thus the method is limited to a finite number of recombinations equal to the
number of selectable markers existing, and produces a concomitant linear increase in the number of marker genes
linked to the selected sequence(s). Caren et al. do not describe the use of multiple selection cycles; recombination
is used solely to construct larger libraries.

[0023] Calogero et al. (46) and Galizzi et al. (47) report that in vivo recombination between two homologous but
truncated insect-toxin genes on a plasmid can produce a hybrid gene. Radman et al. (49) report in vivo recombination
of substantially mismatched DNA sequences in a host cell having defective mismatch repair enzymes, resulting in
hybrid molecule formation.

[0024] It would be advantageous to develop a method for the production of mutant proteins which method allowed
for the development of large libraries of mutant nucleic acid sequences which were easily searched. The invention
described herein is directed to the use of repeated cycles of point mutagenesis, nucleic acid shuffling and selection
which allow for the directed molecular evolution in vitro of highly complex linear sequences, such as proteins, through
random recombination.

[0025] Accordingly, it would be advantageous to develop a method which allows for the production of large libraries
of mutant DNA, RNA or proteins and the selection of particular mutants for a desired goal. The invention described
herein is directed to the use of repeated cycles of mutagenesis, in vivo recombination and selection which allow for
the directed molecular evolution in vivo and in vitro of highly complex linear sequences, such as DNA, RNA or proteins,
through recombination.

[0026] Further advantages of the present invention will become apparent from the following description of the
invention with reference to the attached drawings.

SUMMARY OF THE INVENTION

[0027] The present invention is directed to a method for generating a selected polynucleotide sequence or population
of selected polynucleotide sequences, typically in the form of amplified and/or cloned polynucleotides, whereby the
selected polynucleotide sequence(s) possess a desired phenotypic characteristic (e.g., encode a polypeptide, promote
transcription of linked polynucleotides, bind a protein, and the like) which can be selected for. One method of identifying
polypeptides that possess a desired structure or functional property, such as binding to a predetermined biological
macromolecule (e.g., a receptor), involves the screening of a large library of polypeptides for individual library members
which possess the desired structure or functional property conferred by the amino acid sequence of the polypeptide.
[0028] In a general aspect, the invention provides a method, termed 'sequence shuffling', for generating libraries of
recombinant polynucleotides having a desired characteristic which can be selected or screened for. Libraries of
recombinant polynucleotides are generated from a population of related-sequence polynucleotides which comprise
sequence regions which have substantial sequence identity and can be homologously recombined in vitro or in vivo. In
the method, at least two species of the related-sequence polynucleotides are combined in a recombination system
suitable for generating sequence-recombined polynucleotides, wherein said sequence-recombined polynucleotides
comprise a portion of at least one first species of a related-sequence polynucleotide with at least one adjacent portion
of at least one second species of a related-sequence polynucleotide. Recombination systems suitable for generating
sequence-recombined polynucleotides can be either: (1) in vitro systems for homologous recombination or sequence
shuffling via amplification or other formats described herein, or (2) in vivo systems for homologous recombination or
site-specific recombination as described herein. The population of sequence-recombined polynucleotides comprises
a subpopulation of polynucleotides which possess desired or advantageous characteristics and which can be selected
by a suitable selection or screening method. The selected sequence-recombined polynucleotides, which are typically
related-sequence polynucleotides, can then be subjected to at least one recursive cycle wherein at least one selected
sequence-recombined polynucleotide is combined with at least one distinct species of related-sequence polynucleotide
(which may itself be a selected sequence-recombined polynucleotide) in a recombination system suitable for generating
sequence-recombined polynucleotides, such that additional generations of sequence-recombined polynucleotide
sequences are generated from the selected sequence-recombined polynucleotides obtained by the selection or screening
method employed. In this manner, recursive sequence recombination generates library members which are
sequence-recombined polynucleotides possessing desired characteristics. Such characteristics can be any property or attribute
capable of being selected for or detected in a screening system, and may include properties of: an encoded protein,
a transcriptional element, a sequence controlling transcription, RNA processing, RNA stability, chromatin conformation,
translation, or other expression property of a gene or transgene, a replicative element, a protein-binding element, or
the like, such as any feature which confers a selectable or detectable property.
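
The recursive cycle described above (recombine related sequences, select, then recombine the selected members again) can be illustrated in silico. The following is a minimal sketch under stated assumptions, not the method of the invention: sequences are modeled as aligned strings of equal length, recombination is reduced to a single random crossover, and the function names and the user-supplied `score` function are hypothetical stand-ins for a real selection or screening assay.

```python
import random
from typing import Callable, List


def recombine(a: str, b: str, rng: random.Random) -> str:
    """Join a portion of one parent to the adjacent portion of another.

    Assumes a and b are aligned and of equal length (at least 2).
    """
    point = rng.randrange(1, len(a))  # crossover strictly inside the sequence
    return a[:point] + b[point:]


def recursive_recombination(
    pool: List[str],
    score: Callable[[str], float],
    cycles: int,
    library_size: int,
    keep: int,
    rng: random.Random,
) -> List[str]:
    """Alternate recombination and selection for a fixed number of cycles.

    Assumes len(pool) >= 2 and keep >= 2 so that two parents can be drawn.
    """
    selected = list(pool)
    for _ in range(cycles):
        # Recombination step: build a library from pairs of selected members.
        library = [
            recombine(*rng.sample(selected, 2), rng) for _ in range(library_size)
        ]
        # Selection step: parents compete with recombinants, so the best
        # score in the pool can never decrease from one cycle to the next.
        selected = sorted(library + selected, key=score, reverse=True)[:keep]
    return selected


if __name__ == "__main__":
    TARGET = "ACGTACGTAC"  # hypothetical optimum, used only by the toy score

    def matches(seq: str) -> int:
        return sum(x == y for x, y in zip(seq, TARGET))

    parents = ["ACGTAAAAAA", "AAAAACGTAC", "AAAAAAAAAA"]
    best = recursive_recombination(
        parents, matches, cycles=5, library_size=50, keep=4, rng=random.Random(0)
    )
    print(best[0], matches(best[0]))
```

Retaining the parents in each selection step is a design choice of this sketch; it mirrors the statement that a selected recombinant may be combined again with another related-sequence species in later cycles.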

[0029] The present invention provides a method for generating libraries of displayed polypeptides or displayed
antibodies suitable for affinity interaction screening or phenotypic screening. The method comprises (1) obtaining a first
plurality of selected library members comprising a displayed polypeptide or displayed antibody and an associated
polynucleotide encoding said displayed polypeptide or displayed antibody, and obtaining said associated
polynucleotides or copies thereof wherein said associated polynucleotides comprise a region of substantially identical sequence,
optionally introducing mutations into said polynucleotides or copies, and (2) pooling and fragmenting, by nuclease
digestion, partial extension PCR amplification, PCR stuttering, or other suitable fragmenting means, typically producing
random fragments or fragment equivalents, said associated polynucleotides or copies to form fragments thereof under
conditions suitable for PCR amplification, performing PCR amplification and optionally mutagenesis, and thereby
homologously recombining said fragments to form a shuffled pool of recombined polynucleotides, whereby a substantial
fraction (e.g., greater than 10 percent) of the recombined polynucleotides of said shuffled pool are not present in the
first plurality of selected library members, said shuffled pool composing a library of displayed polypeptides or displayed
antibodies suitable for affinity interaction screening. Optionally, the method comprises the additional step of screening
the library members of the shuffled pool to identify individual shuffled library members having the ability to bind or
otherwise interact (e.g., such as catalytic antibodies) with a predetermined macromolecule, such as for example a
proteinaceous receptor, peptide, oligosaccharide, virion, or other predetermined compound or structure. The displayed
polypeptides, antibodies, peptidomimetic antibodies, and variable region sequences that are identified from such
libraries can be used for therapeutic, diagnostic, research, and related purposes (e.g., catalysts, solutes for increasing
osmolarity of an aqueous solution, and the like), and/or can be subjected to one or more additional cycles of shuffling
and/or affinity selection. The method can be modified such that the step of selecting is for a phenotypic characteristic
other than binding affinity for a predetermined molecule (e.g., for catalytic activity, stability, oxidation resistance, drug
resistance, or detectable phenotype conferred on a host cell).

[0030] In one embodiment, the first plurality of selected library members is fragmented and homologously recombined
by PCR in vitro. Fragment generation is by nuclease digestion, partial extension PCR amplification, PCR stuttering, or
other suitable fragmenting means, such as described herein. Stuttering is fragmentation by incomplete polymerase
extension of templates. A recombination format based on very short PCR extension times was employed to create
partial PCR products, which continue to extend off a different template in the next (and subsequent) cycle(s).
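
The template switching that underlies stuttering can be pictured with a toy model. This is a sketch under simplifying assumptions that are not in the text: templates are aligned strings of equal length, each cycle adds between one and `max_step` bases (standing in for a very short extension time), and the growing product reanneals to a randomly chosen template in every cycle. The function name `stutter_extend` is hypothetical.

```python
import random
from typing import List


def stutter_extend(templates: List[str], max_step: int, rng: random.Random) -> str:
    """Build one full-length product by repeated partial extension.

    Each loop iteration models one cycle: the incomplete product anneals to
    some template and is extended by a few bases copied from that template.
    """
    length = len(templates[0])
    product = ""
    while len(product) < length:
        template = rng.choice(templates)   # may differ from the previous cycle
        step = rng.randint(1, max_step)    # short extension: only a few bases
        start = len(product)
        product += template[start:start + step]  # slicing stops at the 3' end
    return product


if __name__ == "__main__":
    parents = ["AAAAAAAAAAAAAAAAAAAA", "CCCCCCCCCCCCCCCCCCCC"]
    # With two distinguishable parents, a mixed A/C pattern marks each
    # point at which extension resumed on the other template.
    print(stutter_extend(parents, max_step=4, rng=random.Random(1)))
```

Smaller values of `max_step` give more opportunities to switch templates per product, which is the intuition behind using very short extension times.
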
[0031] In one embodiment, the first plurality of selected library members is fragmented in vitro, the resultant fragments
transferred into a host cell or organism and homologously recombined to form shuffled library members in vivo.
[0032] In one embodiment, the first plurality of selected library members is cloned or amplified on episomally
replicable vectors, a multiplicity of said vectors is transferred into a cell and homologously recombined to form shuffled
library members in vivo.
[0033] In one embodiment, the first plurality of selected library members is not fragmented, but is cloned or amplified
on an episomally replicable vector as a direct repeat or indirect (or inverted) repeat, with each repeat comprising a
distinct species of selected library member sequence, said vector is transferred into a cell and homologously
recombined by intra-vector or inter-vector recombination to form shuffled library members in vivo.

[0034] In an embodiment, combinations of in vitro and in vivo shuffling are provided to enhance combinatorial diversity.

[0035] The present invention provides a method for generating libraries of displayed antibodies suitable for affinity
interaction screening. The method comprises (1) obtaining a first plurality of selected library members comprising a
displayed antibody and an associated polynucleotide encoding said displayed antibody, and obtaining said associated
polynucleotides or copies thereof, wherein said associated polynucleotides comprise a region of substantially identical
variable region framework sequence, and (2) pooling and fragmenting said associated polynucleotides or copies to
form fragments thereof under conditions suitable for PCR amplification and thereby homologously recombining said
fragments to form a shuffled pool of recombined polynucleotides comprising novel combinations of CDRs, whereby a
substantial fraction (e.g., greater than 10 percent) of the recombined polynucleotides of said shuffled pool comprise
CDR combinations which are not present in the first plurality of selected library members, said shuffled pool composing
a library of displayed antibodies comprising CDR permutations and suitable for affinity interaction screening. Optionally,
the shuffled pool is subjected to affinity screening to select shuffled library members which bind to a predetermined
epitope (antigen) and thereby selecting a plurality of selected shuffled library members. Optionally, the plurality of
selected shuffled library members can be shuffled and screened iteratively, from 1 to about 1000 cycles or as desired
until library members having a desired binding affinity are obtained.

[0036] Accordingly, one aspect of the present invention provides a method for introducing one or more mutations
into a template double-stranded polynucleotide, wherein the template double-stranded polynucleotide has been
cleaved or PCR amplified (via partial extension or stuttering) into random fragments of a desired size, by adding to the
resultant population of double-stranded fragments one or more single or double-stranded oligonucleotides, wherein
said oligonucleotides comprise an area of identity and an area of heterology to the template polynucleotide; denaturing
the resultant mixture of double-stranded random fragments and oligonucleotides into single-stranded fragments;
incubating the resultant population of single-stranded fragments with a polymerase under conditions which result in the
annealing of said single-stranded fragments at regions of identity between the single-stranded fragments and formation
of a mutagenized double-stranded polynucleotide; and repeating the above steps as desired.
[0037] In another aspect the present invention is directed to a method of producing recombinant proteins having
biological activity by treating a sample comprising double-stranded template polynucleotides encoding a wild-type
protein under conditions which provide for the cleavage of said template polynucleotides into random double-stranded
fragments having a desired size; adding to the resultant population of random fragments one or more single or
double-stranded oligonucleotides, wherein said oligonucleotides comprise areas of identity and areas of heterology to the
template polynucleotide; denaturing the resultant mixture of double-stranded fragments and oligonucleotides into
single-stranded fragments; incubating the resultant population of single-stranded fragments with a polymerase under
conditions which result in the annealing of said single-stranded fragments at the areas of identity and formation of a
mutagenized double-stranded polynucleotide; repeating the above steps as desired; and then expressing the
recombinant protein from the mutagenized double-stranded polynucleotide.

[0038] A third aspect of the present invention is directed to a method for obtaining a chimeric polynucleotide by
treating a sample comprising different double-stranded template polynucleotides wherein said different template
polynucleotides contain areas of identity and areas of heterology under conditions which provide for the cleavage of said
template polynucleotides into random double-stranded fragments of a desired size; denaturing the resultant random
double-stranded fragments contained in the treated sample into single-stranded fragments; incubating the resultant
single-stranded fragments with polymerase under conditions which provide for the annealing of the single-stranded
fragments at the areas of identity and the formation of a chimeric double-stranded polynucleotide sequence comprising
template polynucleotide sequences; and repeating the above steps as desired.

[0039] A fourth aspect of the present invention is directed to a method of replicating a template polynucleotide by
combining in vitro single-stranded template polynucleotides with small random single-stranded fragments resulting
from the cleavage and denaturation of the template polynucleotide, and incubating said mixture of nucleic acid
fragments in the presence of a nucleic acid polymerase under conditions wherein a population of double-stranded template
[0040] The invention also provides the use of polynucleotide shuffling, in vitro and/or in vivo, to shuffle polynucleotides
encoding polypeptides and/or polynucleotides comprising transcriptional regulatory sequences.
[0041] The invention also provides the use of polynucleotide shuffling to shuffle a population of viral genes (e.g.,
capsid proteins, spike glycoproteins, polymerases, proteases, etc.) or viral genomes (e.g., paramyxoviridae,
orthomyxoviridae, herpesviruses, retroviruses, reoviruses, rhinoviruses, etc.). In an embodiment, the invention provides
a method for shuffling sequences encoding all or portions of immunogenic viral proteins to generate novel combinations
of epitopes as well as novel epitopes created by recombination; such shuffled viral proteins may comprise epitopes or
combinations of epitopes which are likely to arise in the natural environment as a consequence of viral evolution (e.g.,
such as recombination of influenza virus strains).
[0042] The invention also provides the use of polynucleotide shuffling to shuffle a population of protein variants, such
as [...]-related, structurally-related, and/or functionally-related enzymes and/or [...] to
create and identify advantageous novel polypeptides, such as enzymes having [...] temperature
profile, stability, oxidation resistance, or other desired feature which can be selected for. Methods suitable for
molecular evolution and directed molecular evolution are provided. Methods to focus selection pressure(s) upon
specific portions of polynucleotides (such as a segment of a coding region) are provided.

[0043] The invention also provides a method suitable for shuffling polynucleotide sequences for generating gene
therapy vectors and replication-defective gene therapy constructs, such as may be used for human gene therapy,
including but not limited to vaccination vectors for DNA-based vaccination, as well as anti-neoplastic gene therapy and
[...]

[0044] The invention provides a method for generating an enhanced green fluorescent protein (GFP) and
polynucleotides encoding same, comprising performing DNA shuffling on a GFP encoding expression vector and selecting
or screening for variants having an enhanced desired property, such as enhanced fluorescence. In a variation, an
embodiment comprises a step of error-prone or mutagenic amplification, propagation in a mutator strain (e.g., a host
cell having a hypermutational phenotype; mutL, etc.; yeast strains such as those described in Klein (1995) Progr. Nucl.
Acid Res. Mol. Biol. 51: 217, incorporated herein by reference), chemical mutagenesis, or site-directed mutagenesis.
In an embodiment, the enhanced GFP protein comprises a point mutation outside the chromophore region (amino
acids 64-69), preferably in the region from amino acid 100 to amino acid 173, with specific preferred embodiments at
residue 100, 154, and 164; typically, the mutation is a substitution mutation, such as F100S, M154T or V164A. In an
embodiment, the mutation substitutes a hydrophilic residue for a hydrophobic residue. In an embodiment, multiple
mutations are present in the enhanced GFP protein and its encoding polynucleotide. The invention also provides the
use of such an enhanced GFP protein, such as for a diagnostic reporter for assays and high throughput screening

[0045] The invention also provides for improved embodiments for performing in vitro sequence shuffling. In one
aspect, the improved shuffling method includes the addition of at least one additive which enhances the rate or extent
of reannealing or recombination of related-sequence polynucleotides. In an embodiment, the additive is polyethylene
glycol (PEG), typically added to a shuffling reaction to a final concentration of 0.1 to 25 percent, often to a final
concentration of 2.5 to 15 percent, to a final concentration of about 10 percent. In an embodiment, the additive is dextran
sulfate, typically added to a shuffling reaction to a final concentration of 0.1 to 25 percent, often at about 10 percent.
In an embodiment, the additive is an agent which reduces sequence specificity of reannealing and promotes
promiscuous hybridization and/or recombination in vitro. In an alternative embodiment, the additive is an agent which increases
sequence specificity of reannealing and promotes high fidelity hybridization and/or recombination in vitro. Other
long-chain polymers which do not interfere with the reaction may also be used (e.g., polyvinylpyrrolidone, etc.).
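
As a worked example of reaching a stated final additive concentration, the usual dilution relation C1 x V1 = C2 x V2 gives the volume of stock to add. The sketch below is illustrative: the 50 percent stock and the 50 microliter reaction volume are assumptions chosen for the example and do not come from the text, and the function names are hypothetical. Only the 0.1 to 25 percent range and the figure of about 10 percent are taken from the paragraph above.

```python
# Range quoted in the text for PEG or dextran sulfate, in percent.
SOURCE_RANGE_PERCENT = (0.1, 25.0)


def in_source_range(final_percent: float) -> bool:
    """True if a final concentration lies within the range quoted in the text."""
    low, high = SOURCE_RANGE_PERCENT
    return low <= final_percent <= high


def stock_volume_ul(
    final_percent: float, stock_percent: float, reaction_volume_ul: float
) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2.

    The stock must be at least as concentrated as the desired final value.
    """
    if not 0 < final_percent <= stock_percent:
        raise ValueError("final concentration must be positive and <= stock")
    return reaction_volume_ul * final_percent / stock_percent


if __name__ == "__main__":
    # Assumed example: 10 percent final PEG in a 50 uL reaction from a
    # 50 percent stock (both the stock and the volume are hypothetical).
    print(stock_volume_ul(final_percent=10.0, stock_percent=50.0,
                          reaction_volume_ul=50.0))
```
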
[0046] In one aspect, the improved shuffling method includes the addition of at least one additive which is a cationic
detergent. Examples of suitable cationic detergents include but are not limited to: cetyltrimethylammonium bromide
(CTAB), dodecyltrimethylammonium bromide (DTAB), and tetramethylammonium chloride (TMAC), and the like.

[0047] In one aspect, the improved shuffling method includes the addition of at least one additive which is a
recombinogenic protein that catalyzes or non-catalytically enhances homologous pairing and/or strand exchange.
Examples of suitable recombinogenic proteins include but are not limited to: E. coli recA protein, the T4 uvsX protein,
the rec1 protein from Ustilago maydis, other recA family recombinases from other species, single strand binding protein
[...] protein; for example, mutant sequences encoding recA can be shuffled and improved heat-stable variants selected by
recursive sequence recombination.

[0048] Non-specific (general recombination) recombinases such as Topoisomerase I, Topoisomerase II ([...]
(1980) J. Biol. Chem. 255: 5560; Trask et al. (1984) EMBO J. 3: 671, incorporated herein by reference) and the like
can be used to catalyze in vitro recombination reactions to shuffle a plurality of related sequence polynucleotide species
by the recursive methods of the invention.

[0049] In one aspect, the improved shuffling method includes the addition of at least one additive which is an enzyme
[...] include, but are not limited to:

Thermus flavus DNA polymerase (Tfl)
Thermus thermophilus DNA polymerase (Tth)
Thermococcus litoralis DNA polymerase (Tli, Vent)
Pyrococcus woesei DNA polymerase (Pwo)
Thermotoga maritima DNA polymerase (UlTma)
Thermus brockianus DNA polymerase (Thermozyme)
Pyrococcus furiosus DNA polymerase (Pfu)
Thermococcus sp. DNA polymerase (9°Nm)
Pyrococcus sp. DNA polymerase ('Deep Vent')
Bacteriophage T4 DNA polymerase
Bacteriophage T7 DNA polymerase
E. coli DNA polymerase I (native and Klenow)
E. coli DNA polymerase III.

roOSO] in an aspect, the imp«>ved shuffling method comprises the modification wherein at least ^Vf «^ 
vIT^i r « e«ension with a polymerase) of reannealed fragnnented library member polynucleotides is conducted 

natured and ^^°l^2^aZ<i amolification provides a substantial fraction of incompletely extended products, is termed 
ZSZTntZT^:::!':^^^^^ -d the incompletely extended products reanneal to and prime ex- 

STn"r:m;rdir^^^^^^ 

breaks; size can be controlled by the fraction of uracil-containing NTP in the PCR mix.

selected to one or more rounds of recursive sequence recombination and selection (or screening). If desired, selected library members from each separate pool may be recombined (integrated) in latter rounds of shuffling. Alternatively, multiple separate parental pools may be used. Inbreeding, wherein selected (or screened) library members of a pool are crossed with each other by the recursive sequence recombination methods of the invention, can be performed alone or in combination with outbreeding, wherein library members of different pools are crossed with each other by the recursive sequence recombination methods of the invention.

EP 0 911 396 A2

In an embodiment of the invention, the method comprises the further step of removing non-shuffled products (e.g., parental sequences) from sequence-recombined polynucleotides produced by the shuffling methods. Non-shuffled products can be removed or avoided by performing amplification with: (1) a first PCR primer which hybridizes to a first parental polynucleotide species but does not substantially hybridize to a second parental polynucleotide species, and (2) a second PCR primer which hybridizes to a second parental polynucleotide species but does not substantially hybridize to the first parental polynucleotide species, such that amplification occurs from templates comprising the portion of the first parental sequence which hybridizes to the first PCR primer and also comprising the portion of the second parental sequence which hybridizes to the second PCR primer; thus only sequence-recombined polynucleotides are amplified.
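
For illustration and not limitation, the primer-based removal of non-shuffled products can be sketched with a toy model. The sketch assumes that hybridization can be approximated by exact string matching at the two ends of a template; the sequences and the names `PARENT_A`, `PARENT_B`, and `amplifies` are hypothetical and are not taken from the specification.

```python
# Toy model: a template is amplified only if it carries the site for the
# first primer (specific to parent A) and the site for the second primer
# (specific to parent B). Exact string matching stands in for hybridization.

PARENT_A = "AAGGCT" + "TTTTGGGG" + "CCATGA"  # 5' primer site, body, 3' primer site
PARENT_B = "AAGCCT" + "TTTTGGGG" + "CCTTGA"

FIRST_PRIMER_SITE = PARENT_A[:6]    # found in parent A, absent from parent B
SECOND_PRIMER_SITE = PARENT_B[-6:]  # found in parent B, absent from parent A


def amplifies(template: str) -> bool:
    """Return True if both primer sites are present on the template."""
    return (template.startswith(FIRST_PRIMER_SITE)
            and template.endswith(SECOND_PRIMER_SITE))


# A recombinant carrying the 5' portion of parent A and the 3' portion of parent B.
recombinant = PARENT_A[:10] + PARENT_B[10:]

pool = {"parent_A": PARENT_A, "parent_B": PARENT_B, "recombinant": recombinant}
amplified = [name for name, seq in pool.items() if amplifies(seq)]
print(amplified)
```

In this simplified model neither parental sequence is amplified, because each lacks one of the two primer sites. The reciprocal recombinant (5' portion from parent B, 3' portion from parent A) is likewise not amplified by this primer pair.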

In one aspect, the alternative shuffling method includes the use of inter-plasmidic recombination, wherein libraries of sequence-recombined polynucleotide sequences are obtained by genetic recombination in vivo of compatible or non-compatible multicopy plasmids inside suitable host cells. When non-compatible plasmids are used,

,^C^hasa distinct selectable marker and selction for retention of each desired plasmid type is applied. The re ated- 

rjquencewl^i 

vectors, typically bacterial plasmids, each having a distinct and separately selectable marker gene (e.g., a drug resistance gene). Suitable host cells are transformed with both species of plasmid and cells expressing both selectable marker genes are selected and sequence-recombined sequences are recovered and can be subjected to additional rounds of shuffling by any of the means described herein.

In one aspect, the alternative shuffling method includes the use of intra-plasmidic recombination, wherein libraries of sequence-recombined polynucleotide sequences are obtained by genetic recombination in vivo of direct or inverted sequence repeats located on the same plasmid. In a variation, the sequences to be recombined are flanked by site-specific recombination sequences and the polynucleotides are present in a site-specific recombination system, such as an integron (Hall and Collins (1995) Mol. Microbiol. 15: 593, incorporated herein by reference) and can include

as mobile genetic elements both in prokaryotes and eukaryotes. Shuffling can be used to improve the performance of mobile genetic elements. These high frequency recombination vehicles can be used for the rapid optimization of large sequences via transfer of large sequence blocks. Recombination between repeated, interspersed, and diverged DNA sequences, also called "homeologous" sequences, is typically suppressed in normal cells. However, in MutL and MutS cells, this suppression is relieved and the rate of intrachromosomal recombination is increased (Petit et al. (1996) Genetics 129: 327, incorporated herein by reference).

In an aspect of the invention, mutator strains of host cells are used to enhance recombination of more highly
mismatched sequence-related polynucleotides. Bacterial strains such as MutL, MutS, MutT, or MutH or other cells

Sens to hSrExaS^es of such mutagens Include but are not limited to: MNU, ENU, MNNG, nitrosourea, 
BuDR andmerke^U^^^^^^^^ 

such as by irradiation cells used to enhance in vivo recombination. Ionizing radiation and clastogenic agents can be used to enhance mutational frequency and/or to enhance recombination and/or to effect polynucleotide fragmentation.

BRIEF DESCRIPTION OF THE DRAWINGS 



[0061] Figure 1 is a schematic diagram comparing mutagenic shuffling over error-prone PCR; (a) the initial library;
Tpo^Io 1^^^^^^^^^ 

'shuffling'); (f) pool of selected sequences in second round of affinity selection after shuffling; (c) error-prone PCR;
(e) pool of selected sequences in second round of affinity selection after error-prone PCR.
[0062] Figure 2 illustrates the reassembly of a 1.0 kb LacZ alpha gene fragment from 10-50 bp random fragments.
(a) Photograph of a gel of PCR amplified DNA fragment having the LacZ alpha gene. (b) Photograph of a gel of DNA
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P^'"^®^®- . ■« . oti„„ «f th« I ar7 aloha aene stop codon mutants and their DNA sequences. 

sembly process of the LacZ alpha gene. ^ ^^^^^ IL^.g gene (H) 

Kl'^pfgre is a schematic digram o. *e an,«=ody CDR spuming mode, system using the scFv o, anti-rabbi, 

of selection . ^ nRR^?? Sfi-BL-LA-Sf i and in vivo intraplasmidic recombination via direct re- 

a functional ^eta- lactamase gene. „oR322.sfi-2Bla-Sfi and in vivo intraplasmidic recombination va direct re- 

Ss^sraJ~^=^^ 

i::Tfi^:T::^::j'Zr^^^o^ teeing ... ....^^ c •^^f^'^'^^^;;^:^- 

vector libraries by shuffling. . derived from pBADlS 

detectable fluorescence in this fraction. wildtvoe GFR Panel (A) shows 
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Pane, ,B,s.owsares.rlc.,o..ap indicating .c«^ 

Sis?crrcrcr:p=^^^^^^^ 

sequences of distinct library members. numbers refer to reading 
10081] Rgure 21 schematically sho^ vana^^on^^^ 

(IG: immunoglobulin exon; IFN: interferon exon) Hiaoram tor several tiuman genes, showing that pre- 

0082] Figure 22 schematical^ shows ^'^^^^^^^^^^ a splicing module (or shuffling 

ferred units (or shuffling exons begin and end ,n ^^^JJ^ "^^^^^ ' ^^^„g f,^^ at each end. 

exon) can comprtee -""'''P'-t-^'y-^^rcScR ^^^^^^^^^ '° <"°''^ 

[0086] Figure 26 shows plasmid-plasmid recombination. 

[0087] Figure 27 shows plasmid-virus recombination.

[0088] Figure 28 shows virus-virus recombination.

[0089] Figure 29 shows plasmid-chromosome recombination. 

ir]'Kf^«rsT-^^^^^^ 

signal of different Ab-phage clones for eight human protein targets.

[0093] Figure 33 shows Ab-phage recovery versus mutagenesis method.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

(0094] Thepresentinventionrela,es.oamethodfornuc,e.ac.^^^^^^^^^ 

and A applicatkxi to mutagenesis of DNA ''^^"^ „ pa*u^r the present invention ateo relates 

rm=r;r^rx«r^^^^^^^^^ 

mutants; in embodiments where a '"^^^J"^^ "'.^T^ZC^^^sis)^^^ method has particuter 
can compose the resu«ant ^-f^^^-" ^^^^"^^^^^^ fragment(o) may be se- 

re::n^rj::pr:i=^^^^ 
— ^^rrnrin^i:— ^ 



Definitions 
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ra:l,pl.sonwil!M0iden,i,yan.co.pa.^^^^ 
byme homology alignmentalgorithmoNeedlemana.^^^^^ 

.,an,iand«nwy as use^^^^^^^^^^ 

^To^rp: JsC cl a. ,L. 99 pe.en. sequence '^^^^^I'^^Z'ST^ 
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Lnomir single-stranded nuclei acid sequence or s^^^^^^^^^ 
LXroe^etrc^dfaCS 

Srrerr ret rat^r ^^^^^^^^^ 'vV . •pa... sequence- can indicate a sta.ng . 

or areas of the polynucleotides are heterologous. polynucleotide comprises regions which are wild-type and 

eratlng partial length copies o, a parent ^^^^/^^^'^^^P^^^^^ l^^Z more parental sequences. 

amplificatton, or other means of ^"^'^^^^^^^^^^'^^IZ^ components such as polynucleotides, nuclete 

[0114] The temi -popuWion" as used components which belong to the same family 

?:^r~-spocl«onuc.eicacidfragment-m^^^^^^^^^^ 

fragments. in the seauence of a wild-type nucleic acid sequence or changes in the 

direction is the carboxy-termina. director,, "^^^ ^^^^^ ^ the 5' end; the lefthand direction of 

i,ied otherwise, the lefthand end of direction of 5' to 3' addnion of nascent 

rr:r=ran=^^ 

coding RNA transcript are referred to as '"^f object refers to the fact that an object can be 

:r:Krs::::.;=r3er:^^^^^ 

ological (undiseased) individual, such as would be ^^^^^^^^ , „.^,,e of chemical compounds, an array 
[0,19] Theterm-agen.-isusedhere,n.odno,eac^^^^^ 

of spatially localized compounds (e.g., a VI-Sll^S pepiioe array p , bacteriophage antibody (e.g., scFv) 
ecuLrray),abiologicaimacrorm>lecule,aaW^ 

display library, a polysome pept«fe ^'^P'^^' °' ^" f ^g'^^^^ potential acth/ity as antineoplastics, 
fungi, or animal(particularlymammanan)ce^so,,^^^^^^^^^ 

; antl-lnflammatories, orapoptosis '^"1^'°^''^ ^/^^^.^ MmZL , an agent which selectiveV inhibfts a binding 

•rci'tr::rpr:=^^^^^^^^^^ 

inclusion in screening assays described hereinbelow.
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motor basis it is more abundant than any otlier '^''^'^^T^^^^,^,,,, at least about 50 percent (on a 
a substantally purified fraction is a composmon '^''^"^'^^'^ZZ^P^'O composition will comprise rrxjre 
molar basis) of all macromolecuter spec.es P™^ ■ „^^™thV^^^ preferably, the object 

than about 80 to 90 percent of all "^'=''^;Z^^^^^T^^rZ^^ in the composition by conven- 
species is purified to essential '^"-°9«"«2S^r^^Zs,s'n en^^^^ macromolecular species. Solvent 

tional detection methods) v^erein the S are not considered macromolecular species, 

species, small molecules (<500 Daltons), and elemental "".^P^,*^„^ ' ""^^^ s„ength, viscosity, and 

[0121] AS used herein the tern, -physlologlca '°'^'^'°2Tc^^^lnZ^^\^>o^^*^ exist Wracellularly in 
L biochemical parameters which are compauble with a yeast cell grown under 

a viable cultured yeast cell or mammalian cell. ^^^^^^^ ^^Z.^r^lcn conditions for in vjlio tran- 
typical laboratory culture conditions are P*'y^'°''?^=^'^'^'7;,,f|" ^^^^^^^^^^ cc»,dltions comprise 50-200 

rc^^r^rr^s^s^inr^^^^^^^ 
^Hci^H5r:rho'^:;^^.-^^^^^^^^ 

brane fractions and/or antifoam agents and/or . . ^ polynucleotide and a second 

[01221 Specifichybridizationisdefinedhere^as efo^^^^^^^^ 

polynucleotide (e.g., a P,°|V-f ^f''^.^" ^3"*,^^^^^^^^^ under stringent hybridization 

r:rrei=rss:rsr:^^^^^^^^^^^^^^ 

[0123] AS used herein, the term -single^ha.n „^ '° ^^ffJ^P".^^^^^^^^ and which may 

domal, in polypeptide lU^Kage. generally '"^^ZL ^^1'm^'<^^'^^^^ 

compriseadditionalaminoacid sequences at the amino-and/c^c^^^^^^ P ^ single<hain 

may comprise a tether segment for lining to the ^^^^^ "9 "^^""^^^ ^^re po^^p.ide segments of at least 1 0 
antibody. Single-chain antibodies are 9»"«'~°"^^^^^^^^ (e.g., see Thelmun2S!2b: 

contiguous amino ackls substantially encoded by genes of the """""^ "res f H^nio F W Alt, andXH. Rabbitts, 
„lJ.neSuperfami^ ,A.F.Wi.l^sand/.N.Barc^,^ n,r^ 

Sy^iTprrs^rrr^^^^ 

ST'as used herein, the term -complementar^yKfeferrnining '^J^^Z'Z'X^^ r™"" 
L elplified by the Kabat ^^"3— lo^C^^^^^^ KabL et at., 

iable loops (Chothia and Lesk (1 987) J. Mol- Biol, m 90i . ° J; BothlSTwD) (1 987)- and Tramontano 
SequencesUroteinsoflmmunologicallnterest(Natiooa^n~ 

et al. (1990) JJtoLBioL 215; 175). Variable region domains 'VP^^'iy "^^n^^^^^ aRhough variable do- 

105-1 5 amino adds 01 a natural^^curring immunoglobulin =ha>n (e^Q- . am^o^^^ 

mains somewhat shoner or longer are also suitable '"^ 'f^^^Tc^s^fsTa^^^^^^^^ region Werrupfed by three 
r012S] An immunoglobulin light or heavy chain variable region <=<>"s«ts of a « ^^j^g, defined 

: hypeiriablere9tons,alsocalledCDR's.Tl.eexle„^^^^^^^^^ 

(see. -Sequencesof Proteins of lmmunolog««l ^"'^^^'^J^^I^IZmM 
Sen,ices,Bethesda,MD(1987)).ThesequencesolthelrameworW 

conserved within a species. As used herein, a '•'"■^ '^^J^Zln dfna^^^^ human immu- 

identteal (about 85% o, more, usually 80-95% - m°«) °^^^^^ regS the consLent light and 

" ^rtinrsrr^ra^X"^^^^^^^^^ 

?;«rl^usedherein,the,em,.ar.blesegment-.^^^^^^^^^^ 

. :=rde:r=^^^^ 
ix^^ar^'r^^c;^^^^^^^^^^^ 

an antibody fragment, a nucleic acid binding protein, a receptor protein, and the like. 
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or scaffolding moti^, rJ^rleZto a set d po^nucleotida sequences .ha. encodes a se. of 

iroLp";dtar.:.rsr^c::^ronccded.,.h^^ 

pro.eins containing .hose '^"f P«P'"f;f,„^„. ^ of sequences tha. have limited variability, so .ha. 

[0129] As used herein, .he .erm P««"f°^^"^ J! ° 
,orexarnple.he degree o,res,duevar^.l.^ya^^^^^ 

posilion, bu. any pseudorandom posi on is allowed swne aegre sequences tha. are selected 

[0130] Asusedhereln,thetem,Mef,nedsequenceframework™ta^ 
Lanonrandom basis, generally on the basis ofexpe^^^^^^^^^^ 

framework may comprise a se. of ammo acd sequences '•^^'/'^ P^*°™ sequence kernal' is a 

a leucine zipper hepted repeat motif, a zinc-ftnger domain. ^"""9 ^^^^^^ 10-mer sequence 

set o. sequences Which encompassalim^edscopeov^^^^^^^^^^ 

of.he20conven.ionalan.noac^scan ea^^d ^ ^^^^.^ ^^^..^^^ 3, ^..^^^ 

20 convan.ional ammo acids can be any ot (<;u) sequonu maximum number 
posittons and/or overall, (3)adefinedsequencekerna.isas^^^^^^^^^^^ 

of po.en.ial sequences « each rescue comprises variant and in- 
land/or altowable ""oonvenUonalamino^iminoacids)^ A defined s^^^^^^ J ^^^^^^^ ^ 

[013?] Asusedherem-llgandV^erstoamo,^^^^^^^ 

reCrridingp« 

[0134] 

suchasaDNAbindingproteinandar^dom^^^^ 

e.g., so that the random W' f^^" "^"^ ° SSto a nlkage o. polynucleotide elemen.s in a lunc.ional rete- 

the present methods selection is performed by man. 
Methodology

Rare shufflants will contain a large number of the best (e.g., highest affinity) CDRs and these rare shufflants

a prr, roCrnS Llbo^v sequences can be pern,.ta.ed in up .o 100. *^^^^^^^^ 
This large number of permutations cannot be represented in a single library of DNA sequences.
It is contemplated that multiple cycles of DNA shuffling and selection may be required depending on the length of the

[orre^otiTeTcarnr^^^^^^^ 

™:;^,'TetrpreX=leo.ide Which may be used in themes 

4,683,202 and 4,683,195) or other amplification or cloning methods. However, the removal of free primers from the
PCR product before fragmentation provides a more efficient result.

lofo^™ r:rempl"^^^cle~ nen should be doubie-stranded. A double.tranded nuclei -id molecule is 
^enrretat'eUo of the resuming single^tranded nucleic acid fragments are complementary to each 
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nucleic acid fragments havmg a region of f^'l^ ^^^^^^^ ^^pared to the total nucleic acid, more 

Sr'whetamixture of differentbutrelated template polynucl^^^^^^^^^^^ 
Ln.s\r».^c.o,t.e.em^^^^^^^^^ 

rtrn^^rde^irrsrr^^^^^ 
rrr-c^reX^^^^^^^ 

the nucleic acid fragments include pressure (36) and pH p„ferablv the temperature is from 20 'C to 75 

[0162] The nucleic acid f mgments '"^V ^« ;«f ""^f/.^^ j^C^of c'osT<S,ers is needed based on an 

•C, more preferably the temperature ,s from 40 ^ CJa high ^^^^^^ temperature, 

srprrsrb^rre^dST^^^^ 
?^^rB=rcSi^rrrr=^^^^^^ 

L~crro^= 

concentrations of salt and/or PEG can be used, if desired. polymerase and 

Ko~-^di;^--"- 

ase or any other DNA polymerase known in the art. minimum dearee of homology that should still 

[^'^^rreto^merasemay be added to the randomnucleic acid fragments ph. 

annealing or after annealing. ' „„eonca of oclvmorase can be referred to as 

[0167] The cycle of denaturation, renaturatK« and .ncubation p,e,srablv the cycle 
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cane., homologous recon,bm.ion can be used ir ccnbina.ion with or in place of de.erminis.ic recombination, such 

same size as the template polynucleotide ,n tandem. Th, '^'^^^ ''^'^^^^"^^^^^ o. approximately the same 
0, the temp^te 

size as the template polynucleotide. The population ^ ,„ template polynucleotide 
acid fragments having an area o. identtty and an ^^l^ ^,Z^^opo,^t<o^. lipTfection, or the like) 

be used lor transformation of a host cell, or the like. . ^ ^ j^^jg used to transform bacteria. 

[0170] ThesefragmentarethenclonedintotheapproprialevecterandW 
0171 Itiseontemptetedthatthesinglenucleicacidfragmentsmayb^^^^^^^ 

Lid fragment by ^-^'^'^^^"f^^^^^^^^ .^e con- 

101721 The vector used for Coning is no. <;;;^=^« PJ'^^^^ and translatton 

r r;r st rs^r:^^^^^ :re-"- - - °- — - - 

Preferred vectors include the pUC series and the pBR P^;':;^^.^^^, fragments having random mu- 

desired, the proteins expressed by each ^9^' "^^^^^^^ chromatography). If a DNA fragrrient which 

the desired properties onto the protein. u .„^„h=.n«Hknlav svstem in which fragments of the protein 

10175] "tiscontemplatedthat<Mneskilledintheartcoulduseaph^^^^ 

Lre expressed as fusion proteins on the phage surface '■''''^"^^^^Z^^alJ^Z^in. a portion of whid, is 
are cloned into the phage DNA at a site which results '"J^^l^^'^^J^^.Zl acid molecule undergoes 

r.i^=s.^^n-=c=^^ 

co^binant protein. In this manner, proteins wrth even j',,^'"^^^^^^^ „i,h a mixture of 

Kpe^^rarrr^^^^^^^^^^ 

acid Shuffling in order to remove «"V.='^"' -"^"""^ J'^^ll^i^S^^ nuclei acid. Thus the process may 
10178] Any source of nucleic acd. ,n P^^^i^f 'J, ^" ^^."''^^^^^^^^ or double stranded. In addition, a 

employ DNA or RN A including messenger RNA, wh«* DNA or ™A may be s«ig ^ ^^^.^^ 

DNA-RNA hybrid whfch contains one strand "J be "^^^^^^^^^ 

;:rm%rsrs::rr:Trre^^rh=~^ 
°;;rnr:Lry::rrr^^^^^^^ 

DNA ir RNA or from natural DNA or RNA from any sou«=9 ''^^ ^^'^^f^^^^^^^ 

such as plants or animals. DNA or RNA may be extracted from blood or ''"^ ^'g™ ;,683,202 and 

may be obtained by amplif^tion' using the P°'y-;«°'' ^"^^^^^ 

r.:!s^rc;trrg.c=^^^^^^^^^^ 
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;S zxz sxssz .» "»"T'»Tr r:??„z. 



not increase. 
Parallel PCR



size increases with the number of PCR cycles.

[0187] By using multiple primers in parallel, sequences in excess of 50 kb can be amplified. Whole genes and whole

[0188] In sexual PCR, also called DNA shuffling, parallel PCR is used to perform in vitro recombination on a pool of

mmmmmim 
many cycles of PCR because only half of the annealed pairs have extendable overhangs and the concentration of 3' ends is low.
Utility 



[0190] The DNA shuffling method of this invention can be performed blindly on a pool of unknown sequences. By
adding to the reassembly mixture oligonucleotides (with ends that are homologous to the sequences being reassembled

[0192] Shuffling requires the presence of homologous regions separating regions of diversity. If the sequences to
can be used to create scaffold-like proteins with various combinations of mutated sequences for binding.



In Vitro Shuffling



the availability of sufficient sequence data. On the other hand, random digestion of the genome with DNAseI followed
[0197] The polynucleotide to be shuffled can be fragmented randomly or non-randomly, at the discretion of the practitioner.
In Vivo Shuffling



[0198] In an embodiment of in vivo shuffling, the mixed population of the specific nucleic acid sequence is introduced into bacterial (e.g., Archaebacteria) or eukaryotic cells under conditions such that at least two different nucleic acid sequences are present in each host cell. The fragments can be introduced into the host cells by a variety of different

In this embodiment, the specific nucleic acids sequences will be present in vectors which are

only a sufficient number of sequences need be cloned into vectors to ensure that after introduction of the fragments

acids sequences cloned into vectors, this subset may be already stably integrated into the host cell.

It has been found that when two fragments which have regions of identity are inserted into the host cells,

SarsrLombilaiono^^^^^^ 

n^e!cfcidsequ~ 

src:a.tr,o3^r^^^^^^^^^ 

acid sequences are present on linear nucleic acid molecules. Therefore, in a preferred embodiment, some of the
specific nucleic acid sequences are present on linear nucleic acid fragments. In an embodiment, the nucleic acid

the advantage that the M13 virus can be shuffled in prokaryotic cells (e.g., E. coli), and then used to transfer DNA

[^mT' Wtenranslormatton the host cell transfomiants are pteced under selection to identify those host cell trans- 
SntsSontirr^u^^^^ 

resistance to a drug is desired, then the transformed host cells may be subjected to increased concentrations of the particular drug, and those transformants producing mutated proteins able to confer increased drug resistance will be selected. If the enhanced ability of a particular protein to bind to a receptor is desired, then expression of the protein can be induced from the transformants and the resulting protein assayed in a ligand binding assay by methods known in the art to identify that subset of the mutated population which shows enhanced binding to the ligand. Alternatively, the protein can be expressed in another system to ensure proper processing.

Once a subset of the first recombined specific nucleic acid sequences (daughter sequences) having the desired characteristics are identified, they are then subject to a second round of recombination.
In the second cycle of recombination, the recombined specific nucleic acid sequences may be mixed with the original mutated specific nucleic acid sequences (parent sequences) and the cycle repeated as described above. In this way a set of second recombined specific nucleic acids sequences can be identified which have enhanced characteristics or encode for proteins having enhanced properties. This cycle can be repeated a number of times as desired.
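
For illustration and not limitation, repeated cycles of recombination and selection can be sketched as a simple loop. The sketch assumes single-point crossover between equal-length strings and a made-up scoring function standing in for a screen; the names `crossover`, `score`, and `recursive_recombination`, and all sequences, are hypothetical.

```python
import random


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover between two equal-length sequences."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def score(seq: str, target: str) -> int:
    """Stand-in for a screen: count positions matching a hypothetical optimum."""
    return sum(x == y for x, y in zip(seq, target))


def recursive_recombination(parents, target, rounds, offspring, keep, rng):
    """Alternate recombination and selection for a fixed number of cycles."""
    population = list(parents)
    for _ in range(rounds):
        daughters = []
        for _ in range(offspring):
            a, b = rng.sample(population, 2)
            daughters.append(crossover(a, b, rng))
        # Daughters are pooled with the sequences they came from before
        # selection, mirroring the option of mixing daughters with parents.
        candidates = population + daughters
        candidates.sort(key=lambda s: score(s, target), reverse=True)
        population = candidates[:keep]
    return population


rng = random.Random(0)
target = "AAAAAAAAAAAA"
parents = ["AAAATTTTTTTT", "TTTTAAAATTTT", "TTTTTTTTAAAA"]
final = recursive_recombination(parents, target, rounds=5, offspring=50, keep=5, rng=rng)
best_start = max(score(p, target) for p in parents)
best_end = score(final[0], target)
print(best_start, best_end)
```

Because the selected sequences of each cycle are carried into the next cycle, the best score in this model cannot decrease from one cycle to the next.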

It is also contemplated that in the second or subsequent recombination cycle, a backcross can be performed. A molecular backcross can be performed by mixing the desired specific nucleic acid sequences with a large number of the wild-type sequence, such that at least one wild-type nucleic acid sequence and a mutated nucleic acid sequence are present in the same host cell after transformation. Recombination with the wild-type specific nucleic acid sequence will eliminate those neutral or weakly contributory mutations that may affect unselected characteristics such as immunogenicity but not the selected characteristics.
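
For illustration and not limitation, the effect of a molecular backcross can be sketched as follows. The sketch makes an idealized assumption that each mutation is inherited independently with probability 1/2 when a mutant recombines with wild-type sequence; the mutation labels and the names `cross_to_wild_type` and `molecular_backcross` are hypothetical.

```python
import random


def cross_to_wild_type(mutations: frozenset, rng: random.Random) -> frozenset:
    """Progeny of mutant x wild type: each mutation is kept with probability 1/2
    (an idealized free-recombination assumption)."""
    return frozenset(m for m in mutations if rng.random() < 0.5)


def molecular_backcross(mutant, selected, rounds, progeny_per_round, rng):
    """Repeatedly cross to wild type and keep progeny retaining the selected mutations."""
    current = frozenset(mutant)
    for _ in range(rounds):
        progeny = [cross_to_wild_type(current, rng) for _ in range(progeny_per_round)]
        # Selection: only progeny that still carry every selected mutation pass.
        survivors = [p for p in progeny if selected <= p]
        if survivors:
            current = survivors[0]
    return current


rng = random.Random(1)
selected = frozenset({"S1", "S2"})
neutral = frozenset({"n1", "n2", "n3", "n4", "n5", "n6"})
result = molecular_backcross(selected | neutral, selected,
                             rounds=6, progeny_per_round=40, rng=rng)
print(sorted(result))
```

In this model the selected mutations are always retained, while mutations that are not selected for can only be lost, never gained, over successive backcross cycles.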

acid sequences can be fragmented prior to introduction into the host cell. The size of the fragments must be large enough to contain some regions of identity with the other sequences so as to homologously recombine with the other sequences. ssDNA or dsDNA can be coated with RecA in vitro to promote hybridization and/or integration into the host DNA. The size of the fragments will range from 0.03 kb to 100 kb, more preferably

It is also contemplated that in subsequent rounds, all of the specific nucleic acid sequences other than
the sequences selected from the previous round may be cleaved

"Cleavage" may be by nuclease digestion, PCR amplification (via partial extension or stuttering), or other suitable

reifcsriirdT-^^^^^^^^^ 
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u u. ■ I „../»n hufraeradicals metalions acidtreatmenttodepurinateandcleavsl.Ftendwnfragmanls 
can also be ^t^'"^/ '^^^ 3^^^^^^ .ha use of restriction enzymes. The fragmented sequences can be 

'^^tx^"d:^c,douJ^sTr^did «^ 

s|ngle-st anded « dout^^^ ^^^^^.^ ^^.^^1^ ,^ ^^^^^^ „^ds 

tSSal"leng.hcopiesotaparentalsequencecanbeused.oeffect-.ra 

can be achieved. After a certain number of cycles, all possible mutants will have been achieved and further
S""fn"mb"c^iment the same mutated temp^te nucle. acid Is repeated^ recombined and the resulting re- 
rT^lTSl^^m":!:::^^^^ ™<ated .emplate nucle. acid is cloned into a vector capable o, 

also preferred that the vector contain a gene encoding for a selectable marker. 

[0211] The population of vectors containing the pool of mutated nucleic acid sequences is introduced into the host cells. The vector nucleic acid sequences may be introduced by transformation, transfection or infection in the case of phage. The concentration of vectors used to transform the bacteria is such that a number of vectors is introduced into each cell. Once present in the cell, the efficiency of homologous recombination is such that homologous recombination occurs between the various vectors. This re-

srrh"::^'" 

Once a particular pool of daughter mutated nucleic acid sequence has been identified which

Schai^cttisfcsmenucl^ 

nu lL^Smrr^i^edwUh itself or with slmiU 

sequences can be used for subsequent rounds of recombination, in place of or in addition to other selected daughter

It has been shown that by this method nucleic acid sequences having enhanced desired properties can be

In an alternate embodiment, the first generation of mutants are retained in the cells and the first generation
L^JL sMuences areaddedagain to the cells. Accordhgly, the first c^^ 

"^rarthfShmr nucleic ac« sequences are identified, the host cells containing these sequences 

It is contemplated that in subsequent cycles, the population of mutated sequences which are added to the
preferred mutants may come from the parental mutants or any subsequent generation.

In an alternative embodiment, the invention provides a method of conducting a molecular backcross of the
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.suiting— an,sarep,acedunder.he^.ess,.^^^^^^^^^ 

can be used in a molecular backcross to eliminate unnecessary, weakly contributing, and/or silent mutations.
Utility 



10^3, Thelnv.or.co— n,a«,cao,U,is— ^^^^^^^^ 
Ullelesofaspecinc nucleic acidtr^^^^^^^^^^ 

which share one or more recombinogenic sequences l^^' °; ^ ^ 3 homologous recombination -hotspof 

:rzreSnirr;rrr^^^^^^^^^ 

DNA or RNA sequence of the specific nucIeK: 3='^ ''^gment ^^^^^^ generation of 

[0224] Theapproachofusingr^ombi^^^^^^ 

any useful proteins, for example, interleukin , ^^'^^'^^^ ^e useful for the generation of mutant 

generate proteins having altered specificity or « J^' ^'^^^Hihance. sequences, 3' untranslated regions 
Nucleic ac« sequences, for ^'^<'\^''^^1^^;;''^%T^ ,0 g^erate genes having Increased rates of 

:;~Trap;^r^rra-^^^^^^^^^^^^ — 

be useful to mutate ribozymes or aptamers. particularly suitable for the methods 

1022S] Scaffold-like regions ^^'f';^ZC^°^'ZZS "^^^'^^ '^^'^^ 

Of this Invention. The consen/ed '^'T f * ' ° Z o^^^^ scaffolds are the immunoglobulin beta barrel, 

rt^rs^rrm'^^i^^^^^^^^^^^^ 

For example, a -rr^lecular- backcross can '>^P^'^°"^'^^^J^^^Z,\ breeding, this approach can be used to 
type nucleic acid while selecting tor the """'^''^^ "J'^roL o chofc" ^r example, for the removal of 

Peptide Display Methods 

. to shuffle bv in vitro and/or in vivo recombination by any of the disclosed 
[0227] The present method can be used to shuffle, by ^ bT^tide display methods, wherein an asso- 

bi;rerrc^ra"disr^^^^ 

STl^Syimportantaspectofblopharmace^^^^^^^^^^^^^ 

lifi Jonotpepridestructures, '"'=l"'''"9''^tf:T^7^rv°nT.ep^^^^^^^^^ a «""<='"^« °' '""'='""^' 

wrth biologfcal macromolecules. One method of ^ ^ receptor), involves the screening of a 

l^'T-dTr^r—xr^^^^^^ 

Ss also have been reported. a,e type invokes the disp^^^^^^^^ 
the surface of a bacteriophage part^le or eel Gener^^^^^^ 

asanindMdualllbrarymernberdlsplayingasingle pecesoid^^^^^ ^^^^^^ 

filamentous bacterk>phage, typically as a fusion ""^^'^^"^^^^ZIZ l«-9- ^ =° 
ir,cubated with an immobilized, P^^^f^^^'^^^^J^^"^'^; ^^^^ 

"p^rn^f^ tre^rnTrrp^-;!'^ - - — 
bacteriophage particles (i.e., library members) which are bound to the immobilized macromolecule are recovered
and replicated to amplify the selected bacteriophage subpopulation for a subsequent round of affinity enrichment and

typically comprising a variable sequence, that is available for potential binding to a predetermined macromolecule, and to a second polypeptide portion that binds to DNA, such as the DNA vector encoding the individual fusion protein. When transformed host cells are cultured under conditions that allow for expression of the fusion protein, the fusion protein binds to the DNA vector encoding it. Upon lysis of the host cell, the fusion protein/vector DNA complexes can be

^r ened agi^Tsfa^^^^^^^^^^^^ - ^ ^ '? :Lr:::>tein/ 

the phage-based display system, with the replication and sequencing of the DNA vectors in the selected fusion protein/vector DNA complexes serving as the basis for identification of the selected library peptide sequence(s).
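
For illustration and not limitation, the effect of successive rounds of affinity enrichment can be sketched with simple arithmetic. The sketch assumes fixed per-round recovery probabilities for binding and non-binding library members; the numerical values and the name `binder_fraction_after_rounds` are hypothetical and are not taken from the specification.

```python
def binder_fraction_after_rounds(initial_fraction, p_binder, p_background, rounds):
    """Track the fraction of binders in the library after each enrichment round.

    p_binder: assumed probability that a binding member is recovered in a round.
    p_background: assumed probability that a non-binding member is recovered.
    """
    binders = initial_fraction
    others = 1.0 - initial_fraction
    history = []
    for _ in range(rounds):
        binders *= p_binder
        others *= p_background
        total = binders + others
        # Amplification restores the library size without changing proportions.
        binders, others = binders / total, others / total
        history.append(binders)
    return history


history = binder_fraction_after_rounds(
    initial_fraction=1e-6, p_binder=0.5, p_background=1e-3, rounds=3)
for round_number, fraction in enumerate(history, start=1):
    print(round_number, fraction)
```

Under these assumed recovery probabilities, a binder present at one part per million would dominate the library after three rounds, which is why several rounds of selection and amplification are typically alternated.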

Other systems for generating libraries of peptides and like polymers have aspects of both the recombinant and in vitro chemical synthesis methods. In these hybrid methods, cell-free enzymatic machinery is employed to accomplish the in vitro synthesis of the library members (i.e., peptides or polynucleotides). In one type of method, RNA molecules with the ability to bind a predetermined protein or a predetermined dye molecule were selected by alternate rounds of selection and PCR amplification (Tuerk and Gold (1990) Science 249: 505; Ellington and Szostak (1990) Nature 346: 818). A similar technique was used to identify DNA sequences which bind a predetermined human transcription factor (Thiesen and Bach (1990) Nucleic Acids Res. 18: 3203; Beaudry and Joyce (1992) Science 257:

pept^e sequence; such methods are suitable (or use in celHree in v^ro selectKin ^8 !*^2ombination 

[0233] A variation of the method is recursive sequence recombination performed by intron-based recombination, wherein the sequences to be recombined are present as exons (e.g., in the form of exons, whether naturally occurring or artificial) which may share substantial, little, or no sequence identity and which are separated by one or more introns (which may be naturally occurring intronic sequences or not) which share sufficient sequence identity to support homologous recombination between introns. For example but not limitation, a population of polynucleotides comprises library members wherein each library member has one or more copies of a first set of exons linked via a first set of introns to one or more copies of a second set of exons linked via a second set of introns to one or more copies of a third set of exons. Each of the members of the first set of exons may share substantial, little, or no sequence identity with each other or with members of the second or third sets of exons. Similarly, each of the members of the second set of exons may share substantial, little, or no sequence identity with each other or with members of the first or third sets of exons. Similarly, each of the members of the third set of exons may share substantial, little, or no sequence identity with each other or with members of the first or second sets of exons. Each of the members of each set (first, second, third, etc.) of introns shares sufficient sequence identity with the other members of the set to support recombination (homologous or site-specific recombination, including restriction site-mediated recombination) between members of the same intron set, but typically not with members of other intron sets (e.g., the second or third sets), such that intra-set recombination between introns of the library members occurs and generates a pool of recombined library members wherein the first set of exons, second set of exons, and third set of exons are effectively shuffled with each other.
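
For illustration and not limitation, intron-based recombination can be sketched as an exchange between library members at a shared intron. The sketch assumes each library member is an ordered tuple of three exons separated by two introns, and that recombination occurs only between introns of the same set; the exon labels and the names `recombine_at_intron` and `shuffle_exons` are hypothetical.

```python
import random


def recombine_at_intron(member_a, member_b, rng):
    """Exchange at one intron: exons left of the chosen intron come from
    member_a, exons to the right come from member_b."""
    intron = rng.randrange(1, len(member_a))  # intron i lies between exon i-1 and exon i
    return member_a[:intron] + member_b[intron:]


def shuffle_exons(library, n_recombinants, rng):
    """Generate a pool of recombined members by pairwise intron-mediated exchange."""
    pool = []
    for _ in range(n_recombinants):
        a, b = rng.sample(library, 2)
        pool.append(recombine_at_intron(a, b, rng))
    return pool


library = [
    ("exon1_a", "exon2_a", "exon3_a"),
    ("exon1_b", "exon2_b", "exon3_b"),
    ("exon1_c", "exon2_c", "exon3_c"),
]
rng = random.Random(42)
pool = shuffle_exons(library, 20, rng)
for member in pool[:5]:
    print(member)
```

Because exchange is restricted to introns of the same set, every recombined member keeps one exon from each exon set in the original order, while the combination of exon variants differs between members.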

The displayed peptide sequences can be of varying lengths, typically from 3-5000 amino acids long or longer, frequently from 5-100 amino acids long, and often from about 8-15 amino acids long. A library can comprise library members having varying lengths of displayed peptide sequence, or may comprise library members having a fixed length of displayed peptide sequence. Portions or all of the displayed peptide sequence(s) can be random, pseudorandom, defined set kernal, fixed, or the like. The present display methods include methods for in vitro and in vivo

large-scale screening of scFv libraries having broad diversity of variable region sequences and binding 
The present invention also provides random, pseudorandom, and defined sequence framework peptide libraries and methods for generating

single-chain antibodies) that bind to receptor molecules or epitopes of interest or gene products that modify peptides or RNA in a desired fashion. The random, pseudorandom, and defined sequence framework peptides are produced
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oligosaechande, and the like). selection round (typically by affinity selection for binding to a 

'To. :T:Z% bTa™ mil : p^n .be pl.^i;s, ll: sbu,«ed by . vuro and/o, in vbic 

?^T^rnrrarpTr::r=ro^^^^^^^^^^ 

L Jing a peptide having) a predetemtined ^^ing specificrty are formed ^^^^^^ 
peptide or displayed single^hainantibodylibrarvaga-^a^^^^^ 
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enrichedfromthe library to generate a shuffled '^J^^T^^^^ J^^^^^g a^d/or er,riching shuffled library mem- 

rrtrhi;:s:rj^ 

Antibody Display and Screening Methods

. . K . .e«H tn chuffle bv in vitro and/or in vivo recombination by any of the disclosed 
[0243] The present method can be used to shuffle, by bTIrtibodv display methods, wherein an asso- 

resented by the extremely large number of f '"^'^^'!""' 1^*^31^ 
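The naturally-occurring germline immunoglobulin heavy chain locus is composed of separate tandem arrays of variable (V) segment genes located upstream of a tandem array of diversity (D) segment genes, which are themselves located upstream of a tandem array of joining (J) region genes, which are located upstream of the constant (CH) region genes. During B lymphocyte development, V-D-J rearrangement occurs wherein a heavy chain variable region gene (VH) is formed by rearrangement to form a fused D-J segment followed by rearrangement with a V segment to form a V-D-J joined product gene which, if productively rearranged, encodes a functional variable region (VH) of a heavy chain.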

Latorial possibilities 01 jniningVand segments (and^^m^^^^^^ 
menthBceli development. Addi.tonalsepu«,oe 

rearrangements of IheDsegmerits during V-DJ pining andtiwnNr^^^ 1,^1,, 

B cell cLes- selects for higher affinij, ^^^^J^^^^-J^^^^^^^^^^^ "f 

^rsCerr^"^^ 

St^cZ;::rc^emany.thelimnation^^^^^^^ 

antigen-stimulated B cell development (i.e., '"""""7""^);," itoSvSi^ may be screened for high^tfinity 
oped that can be maniputeted .0 produce ^^'"^^^'^l^^^^'Z^^^IesM^ 
antibodies to spec«« antigens. Recent advances^^^^^ 

K'SnSres^^ti-ieshave^ 

may be screened as bacterbphage piques or as cotonies^ysogsfrl p^-^TAcadScyiiMi 
and Koprov-ski (1990) Prnr NRtI '^^■^i,f^u^\ ^ 2^). various embodiments of bacteriophage 

97: 8095; Persson et al. (1991) Prnr Natl ^^^J^J^ ^,,^,^^3 i^e ^'„ described (to^^ (1991)ProcjaatL 
IJSibody disp^y libraries and lambda f^^^ZTf^u^^TeTJSZtie^t, et al. (1990) Nature 348: 552; Burton 
Acad Sci. (U.S.A.) 68: 4363; ClacKson e. al. (199 )^^^ M„.l«ic Acids Res. 19: 4133; Chang 

etai. (1991) Prrv Natl Acad Scj- 1^ S.^jM. 0 34. Ho°9«^^^^^ MarkUtal. (1991) Jjkm Siil 58': 
e.al.(1991)iinB!ffi0Ll£l 3610 8^^^^^^ 

Bart)as et al. i^^) S!SSJM^^^^^^^ T^^^mL 267: 16007; Lowman et al (1991) BiocherffiStDi 

phy) and/or labeled (e.g., to screen plaque or colony lifts). single-chain fragment variable (scFv)

[0248] OneparticuiarVadvantageousapproachhasMen^^^^^^^ 
ibraries (Nterks et al, (1992) ffiot^SD^ JO: M 

,....,^.1. . M.r.,.,e.al. (1991) J. ggL 581 . Chaj^dtenr '.- , p,^ 'm.., 'Acad. Sci. 

Chiswel^i. (1992)I!BIECH 10: 80; ^^Calferty «t al ^^^^^^^^^ 
iUSAlgS: 5879). various embodiments olscFvlibranesasp^^^^^ 

i^reeginnins i" ^988. single^hajn 7^^"^ obtaining the genes encodhg V„ and V, 

r^nrhri!eS:^c^rs!thr:trs^^^^^^^^ 
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u- * • . V/ «.n« lihrnrv or made bv V qene synthesis. The single-chain Fv is formed by connecting the 
both the affinity and specificity of its P^^«"'!'« "J^^^s linked Into a single polypeptide chain by a flexible 


,0 .hibit HIV-1 Virus replK=a^,»nB^o D^^^^^^^^ ThLSIfei^i^ oocytes (Btocca et ai. 
Fusion proteins wherein an scFv IS linked to a seconapoiyp«H«'. /iqq3\j Biol Chem 268:5302). 

wmmmm 
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. nr «n« surtablG antibody display system wherein the antibody is associated with its encoding nucleic 

on polysomes o any surtab^^^^^^^^^ 

For generating diverse variable segments, a collection of synthetic oligonucleotides encoding random, pseudorandom, or a defined sequence kernal set of peptide sequences can be inserted by ligation into a predetermined
o a CDRf S^ter ^DRs of the singleK:f,ain antibody casset.e(s can 

rS^otpSed peptide/polynucleotide complexes (library members) which encode a var^ble segment pepjide 

SeiT^^srarmsr^ns^rb:^^^^^^^^ 

c^i»Lr^fc«ToTnitles ot some peptides are dependent on ionic strength or cation concentration. This « a useful 

fh:::ci::rs!"o^3osmafwreu'^^ 
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•. u Ki^in„^«,riii«is Tha collection Of beads, comprising multiple epitope species, can then be used to isolalo, 
suitable binding MndilKxis T^^^^^^ subsequent affinity screening rounds can include the same 

rnietdT'b stste^oTorbTrcon^^^^ 

screening, and is compatible with laboratory automation, batch processing, and high throughput screening.



A variety of techniques can be used in the present invention to diversify a peptide library or single-chain antibody library, or to diversify, prior to or concomitant with shuffling, around variable segment peptides or VH, VL, or

S^niTin^ry rounds of panning to have sufficient binding activity to the predetem,ined macromolecule or 

jr« js;.^^™ ~ ~ 

in vitro and/or in vivo and one or more additional rounds of screening is done prior to sequencing. The same general approach can be employed with single-chain antibodies in order to expand the diversity and enhance the binding affinity/specificity, typically by diversifying CDRs or adjacent framework regions prior to or concomitant with shuffling. If desired, shuffling reactions can be spiked with mutagenic oligonucleotides capable of in vitro recombination with the selected library members.

of high fidelity methods) can be added to the in vitro shuffling mix and be incorporated into resulting shuffled library



The present invention of shuffling enables the generation of a vast library of CDR-variant single-chain antibodies. One way to generate such antibodies is to insert synthetic CDRs into the single-chain antibody and/or CDR randomization prior to or concomitant with shuffling. The sequences of the synthetic CDR cassettes are selected by referring to known sequence data of human CDR and are selected in the discretion of the practitioner according to the following guidelines: synthetic CDRs will have at least 40 percent positional sequence identity to known CDR sequences. For example, a collection of synthetic CDR sequences can be generated by synthesizing a collection of oligonucleotide sequences on the basis of naturally-occurring human CDR sequences listed in Kabat et al. (1991) op. cit.; the pool(s) of synthetic CDR sequences are calculated to encode CDR peptide sequences having at least 40 percent sequence identity to at least one known naturally-occurring human CDR sequence.

Alternatively, a collection of naturally-occurring CDR sequences may be compared to generate consensus sequences so that amino acids used at a residue position frequently (i.e., in at least 5 percent of known CDR sequences) are incorporated into the synthetic CDRs at the corresponding position(s). Typically, several (e.g., 3 to about 50) known CDR sequences are compared and observed natural sequence variations between the known CDRs are tabulated, and a collection of oligonucleotides encoding CDR peptide sequences encompassing all or most permutations of the observed natural sequence variations is synthesized. For example but not for limitation, if a collection of human VH CDR sequences have carboxy-terminal amino acids which are either Tyr, Val, Phe, or Asp, then the pool(s) of synthetic CDR oligonucleotide sequences are designed to allow the carboxy-terminal CDR residue to be any of these amino acids. In some embodiments, residues other than those which naturally occur at a residue position in the collection of CDR sequences are incorporated: conservative

ever usual^ oTmo^e ^an about 1 x 1 0^ unique CDR sequences are included in the collection a mough occasiona y 
1 X o" to X iS" unique CDR sequences are present, especially if conservative amino acid substitutions are permitted 
in positions where the conservative amino acid substituent is not present or is rare (i.e., less than 0.1 percent) in that position in naturally-occurring human CDRs.
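
The pooling of observed residue variations described above can be illustrated with a short sketch. It assumes the natural sequence variations have already been tabulated as one list of residues per CDR position; the function name and the three-position example are hypothetical, and only the carboxy-terminal Tyr/Val/Phe/Asp set comes from the text.

```python
from itertools import product


def synthetic_cdr_pool(observed_by_position):
    """Enumerate every peptide formed by picking one observed residue per position.

    observed_by_position: list of lists of one-letter amino acid codes,
    one inner list per CDR residue position (hypothetical input format).
    """
    return ["".join(choice) for choice in product(*observed_by_position)]


# Hypothetical tabulation for a three-residue stretch. The last position uses
# the carboxy-terminal example from the text: Tyr (Y), Val (V), Phe (F), Asp (D).
observed = [["A", "G"], ["R"], ["Y", "V", "F", "D"]]
pool = synthetic_cdr_pool(observed)

print(len(pool))  # number of permutations: 2 * 1 * 4
print(pool)
```

The pool size is the product of the number of residues allowed at each position, which is why permitting conservative substitutions at many positions enlarges the collection quickly.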

generalV should no. exceed '^-''P-'^f "^^'^''^^^^^ wim an affinity o, atom 

variable region encoding sequence may be isolated ^^ ^ •^f^^^^^'"^^^^^^^ Zo suitable for human 

quenceencodingadesirc^---^^^^^^ 

rS^Zn'tTcTnre?^^^^^^^^ 

i^^r-^Torr^rconstr^^^^ 

4,704,362, which is incorporated herein by reference).

[0269] In addition to eukaryotic microorganisms such as yeast, mammalian cell culture may also be used to

p^red^'r^rSot^^^^^^^^^^ 

L™ciLc,ings^,uenoeso,be.ween,0.o300bpthat™^^^^^^^^^ 

increase transcription when either 5- or 3' o the ;^,|^^„ ^renhancers, cytomegalovirus 

conferring drug resistance. The first two marker genes P ^'^J *^ "'^ ° by their ability to grow 

ferring resistance to G418. mycophenolic acid and hygromyc.n. ..^^^^^ j^^^ ^^e host cell by well-known 

electroporation, and microinjection (see, generally, Sambrook et al., supra).
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othe, immunoglobulin po^pept,de5 of '''f."^ ^ '^^"^^^^ the like (eee, gererally . 

demic Press, New York, N.Y. ^^f^^^l^^^. . ^. i^^entlon can be used for diagnosis and therapy. By 



Yeast Two-Hybrid Screening Assays 



,0... Shufningcanaisobeus.. 

riceirr^-^^^^^^ 

from another protein species). nredatGrmined polypeptide sequence has 

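[0276] An approach to identifying polypeptide sequences which bind to a predetermined polypeptide sequence has been to use a so-called "two-hybrid" system wherein the predetermined polypeptide sequence is present in a fusion protein (Chien et al. (1991) Proc. Natl. Acad. Sci.). This approach identifies protein-protein interactions in vivo through reconstitution of a transcriptional activator (Fields S and Song O (1989) Nature 340: 245), the yeast Gal4 transcription protein. Typically, the method is based on the properties of the yeast Gal4 protein, which consists of separable domains responsible for DNA-binding and transcriptional activation. Polynucleotides encoding two hybrid proteins, one consisting of the yeast Gal4 DNA-binding domain fused to a polypeptide sequence of a known protein and the other consisting of the Gal4 activation domain fused to a polypeptide sequence of a second protein, are constructed and introduced into a yeast host cell. Intermolecular binding between the two fusion proteins reconstitutes the Gal4 DNA-binding domain with the Gal4 activation domain, which leads to the transcriptional activation of a reporter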

gene (e.g.. <acZ, H/S3) wh^ T/SinUr c, wfth a ^otnCtein SC and Hum SW (1993) MoLBjoL 
identBy novel polypeptide sequences which "J'';™^ ^19921 science 257: 680; Luban etal, (1993) Cell 

neD.17: 155; Durfee et al. (1993) GsQes£|i^Z, ^^:'^^2u^9S2)^^^Il 920; and Vojtek et al. (1993) 
W^l. Haidyetal. (1992) GenesDeve!. 6; 801, Bartel^^^^ 

Cin 74: 205). However, variations o, '^^^ l7^Tf ASEBT7 957-. Ulo e. al. (1993) Proc. 
that affect its binding to a second known P;° « " f^^ J^'^^^^^ 3 l9^i;^^Saduraetal. (1993) J. BiolXbaEL 
M«.i Acad. Sc.. (USA) 90: 5524: Jackson et al. (1 d»,,ains of two known proteins 

268: 12046). Two-hybrid systems have also "een used to den i^^ XTj Bio Che,^^^ "498; Staudinger et al: 
ISJdwell etaM1993)rnedJ4teI2b!^£ 11": Ch«k^^^ 

li993,Mio!^2^f°S;-^;^^^^^^^ 

lor otigomerization of a single protein (I wabucht et al. (i aHJj yncoH^ nroteolvtic enzyme (Dasmahapatra 
Var^t^softwo-hybridsystemshavebee^^^ 

et al, (1992) ProcJMSi^£!ji^S^ o..^ ,u..i Ar^ri Sc.. (U.S.A.) 90: 1639) 

mino et al. (1993) Proc Natl Acad. Sci. ^ AJ 9a 933. Buarente l t , heterodimerize or form higher 

consensus sequence(s) and consensus sequence kernals.



Improvements/Alternative Formats
Additives 



[0277] In one aspect, the improved shuffling method includes the addition of at least one additive which enhances the rate or extent of reannealing or recombination of related-sequence polynucleotides.

=irrr(rrgr=or^^^^^^ 
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M + .nH/nr vf+ inn coficentration) can modulate the relative stability of mismatched hybrids, such that 

concentration of 0.1 to 25 percent, often to a final concentration of 2.5 to 15 percent, to a final concentration of about 10 percent. In an embodiment, the additive is dextran sulfate, typically added to a shuffling reaction to a final concentration of 0.1 to 25 percent, often at about 10 percent. In an embodiment, the additive is an agent which reduces sequence specificity of reannealing and promotes promiscuous hybridization and/or recombination in vitro. In an alternative embodiment, the additive is an agent which increases sequence specificity of reannealing and promotes high fidelity hybridization and/or recombination in vitro. Other long-chain polymers which do not interfere with the reaction

r4^rin:n:asre^.'K^^^^^^^^^ 

Examples of suitable cationic detergents include but are not limited to: cetyltrimethylammonium bromide (CTAB), dodecyltrimethylammonium bromide (DTAB), and tetramethylammonium chloride (TMAC), and the like.
EmT' noXCheT:^^^^^^^^ 

recombinogenic protein that catalyzes or non-catalytically enhances homologous pairing and/or strand exchange in vitro. Examples of suitable recombinogenic proteins include but are not limited to: E. coli recA protein, the T4 uvsX protein, the rec1 protein from Ustilago maydis, other recA family recombinases from other species, single strand binding protein (SSB), ribonucleoprotein A1, and the like. Nucleases and proofreading polymerases are often included to improve the maintenance of 3' end integrity. Each of these protein additives can themselves be improved by multiple rounds of recursive sequence recombination and selection and/or screening. The invention embraces such improved additives and their use to further enhance shuffling.



Recombinase Proteins 



abilitv to Droperly bind to and position targeting polynucleotides on their homologous targe s and tii) y 

bSn^Zr^^^^ 

i#£Trefar;^isrp^^ 
taL^rh'aSx^ss^^^ 

not limitatton: recA, recA803, uvsX, and other recA mutants and recA- like recomb^ases (Ro^ A^^^^^ 
Riochem.Molec.Biol . 25: 415), sgEl (Kotodner et al. (1987) Ef2£JM^^^^^§^^ ™ ^ 
Molec Cell. Biol. 11: 2593), RuvC (Dunderdale et al. (1991) Nature 3§4: 506) KEM1. XRN1 ^^^^ ^ 

"aPi^ l^lecCe irBiol . 11: 2583). STP«/DST1 (Clark et al. (^99^)i^2!S£;r£2!LB!|. JL 2^^^^ HPP^ 439^ 
Loi Br^y. M..I Ar/ri Tel. (U.S.A.) 88: 9067). other eukaryotic recombinases (Bishop s'/M' 992) Sell 
Ihinli rrei el iScell 69- 457) ~corporated herein by reference. RecA may be purrfied from £ cof, strains, 
Shinohara et al. ^ 992) ^!L|L -^'j ■ ^ ^i,^^^ University of CaWoma- 

« iCSlfc:^!^ r ^^'c^n^sequences on a -runaway- '^^-^jX:::^^^^::: 
^Tnar^mut^ul/^oTproteinm 

one monomer of recA protein is bound to about 3 nucleotides. This property of recA to coat single-stranded DNA is essentially sequence independent, although particular sequences favor initial loading of recA onto a polynucleotide
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(Hoess and Abremski (1985 J Mol. biol ogi^^ i t sita-specific recombination systems 

family of recombinases is found in Argos et al. (1986) EMBO J. 5: 433, incorporated herein by reference.

Exonuclease 



ases are: 
Bal31 

Bacteriophage Lambda exonuclease 
E. coli Exonuclease I 
E. coli Exonuclease III 
E. coli Exonuclease VII 
Bacteriophage T7 gene 6. 

Stuttering 



fan SS to and prime polynucleotide chain extenston from a second 

sequence-recombined polynucleotides are generated from sequence-related polynucleotides which are naturally-occurring genes or alleles of a gene. In this aspect, at least two naturally-occurring genes and/or alleles which comprise regions of at least 50 consecutive nucleotides which have at least 70 percent sequence identity, preferably at least 90 percent sequence identity, are selected from a pool of gene sequences, such as by hybrid selection or via computerized sequence analysis using sequence data from a database. The selected sequences are shuffled by any of the various embodiments of the invention.
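
The computerized selection step above can be sketched as a sliding-window identity check. This is a minimal illustration that assumes the candidate sequences are already aligned without gaps; the function names, the toy sequences, and the default threshold are hypothetical and do not come from the specification.

```python
def best_window_identity(a, b, window=50):
    """Highest fraction of matching positions over any run of `window`
    consecutive nucleotides, comparing a and b at the same offsets
    (assumes an ungapped, pre-aligned pair)."""
    n = min(len(a), len(b))
    if n < window:
        return 0.0
    matches = [1 if a[i] == b[i] else 0 for i in range(n)]
    current = sum(matches[:window])
    best = current
    for i in range(window, n):
        # Slide the window one position to the right.
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return best / window


def select_related(genes, window=50, min_identity=0.9):
    """Return the name pairs that share at least one qualifying window."""
    names = sorted(genes)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if best_window_identity(genes[x], genes[y], window) >= min_identity
    ]


# Toy 60-nucleotide sequences (hypothetical data).
gene_a = "ACGT" * 15
chars = list(gene_a)
for pos in (5, 25, 45):
    chars[pos] = "T"          # three substitutions relative to gene_a
gene_b = "".join(chars)
gene_c = "TGCA" * 15          # mismatches gene_a at every position

pairs = select_related({"a": gene_a, "b": gene_b, "c": gene_c})
print(pairs)
```

A real analysis would use a proper alignment tool before scoring; the ungapped comparison here only shows how the 50-nucleotide window and the percent-identity cutoff interact.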

In an embodiment of the invention, the method comprises the further step of removing non-shuffled products (e.g., parental sequences) from sequence-recombined polynucleotides produced by any of the disclosed shuffling methods. Non-shuffled products can be removed or avoided by performing amplification with: (1) a first PCR primer which hybridizes to a first parental polynucleotide species but does not substantially hybridize to a second parental polynucleotide species, and (2) a second PCR primer which hybridizes to a second parental polynucleotide species but does not substantially hybridize to the first parental polynucleotide species, such that amplification occurs only from templates comprising the portion of the first parental sequence which hybridizes to the first PCR primer and also comprising the portion of the second parental sequence which hybridizes to the second PCR primer, thus only sequence-recombined polynucleotides are amplified.
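
The logic of the two parent-specific primers can be sketched as a simple filter. This is an illustration under strong simplifying assumptions: primer hybridization is modelled as an exact substring match on one strand, and all sequences and names are hypothetical.

```python
def amplifies(template, first_site, second_site):
    """True only if the template carries the first primer's site followed,
    further downstream, by the second primer's site."""
    start = template.find(first_site)
    if start == -1:
        return False
    return template.find(second_site, start + len(first_site)) != -1


# Hypothetical parental sequences: distinct 5' ends, middles, and 3' ends.
parent_1 = "ATGAAA" + "CCCCCC" + "GGATAA"
parent_2 = "ATGTTT" + "CACACA" + "GGTTAA"

first_site = "ATGAAA"    # found only at the 5' end of parent 1
second_site = "GGTTAA"   # found only at the 3' end of parent 2

chimera = "ATGAAA" + "CACACA" + "GGTTAA"     # 5' end of parent 1, rest of parent 2
reciprocal = "ATGTTT" + "CCCCCC" + "GGATAA"  # 5' end of parent 2, rest of parent 1

for name, seq in [("parent_1", parent_1), ("parent_2", parent_2),
                  ("chimera", chimera), ("reciprocal", reciprocal)]:
    print(name, amplifies(seq, first_site, second_site))
```

In this sketch neither parent is amplified, and only recombinants that carry both primer sites are. The reciprocal chimera is also excluded, so a second primer pair would be needed to recover it.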

[sr^^ntrnironS^^ 

(e.g., genes) lack satisfactory sequence similarity for efficient homologous recombination or for cross-priming for PCR amplification, intermediate (or "bridging") genes can be synthesized which share sufficient sequence identity with the parental sequences. The bridging gene need not be active or confer a phenotype or selectable property; it need only provide a template having sufficient sequence identity to accomplish shuffling of the parental sequences. The intermediate homology of the bridging gene, and the necessary sequence(s) can be determined by computer or



The invention also provides additional formats for performing recursive recombination in vivo, either in prokaryotic or eukaryotic cells. These formats include recombination between plasmids, recombination between viruses, recombination between plasmid and virus, recombination between a chromosome and plasmid or virus, and intramolecular recombination (e.g., between two sequences on a plasmid). Recursive recombination can be performed entirely in vivo, whereby successive rounds of in vivo recombination are interspersed by rounds of selection or screening. In vivo formats can also be used in combination with in vitro formats. For example, one can perform one round of in vitro shuffling, a round of selection, a round of in vivo shuffling, a further round of selection, a further round of in vitro shuffling and a further round of selection and so forth. The various in vivo formats are now considered in turn.

(a) Plasmid-Plasmid Recombination

[0293] The initial substrates for recombination are a collection of polynucleotides comprising variant forms of a gene. The variant forms usually show substantial sequence identity to each other sufficient to allow homologous recombination between substrates. The diversity between the polynucleotides can be natural (e.g., allelic or species variants), induced (e.g., error-prone PCR, synthetic genes, codon-usage altered sequence variants), or the result of in vitro recombination. There should be at least sufficient diversity between substrates that recombination can generate more diverse products than there are starting materials. There must be at least two substrates differing in at least two positions. However, commonly a library of substrates of 10^3-10^8 members is employed. The degree of diversity depends on the length of the substrate being recombined and the extent of the functional change to be evolved. Diversity at

incorporated into plasmids. The plasmids are often standard cloning vectors, e.g., bacterial multicopy plasmids. However, in some methods to be described below, the plasmids include mobilization (MOB) functions. The substrates can be incorporated into the same or different plasmids. Often at least two different types of plasmid having different types of selection marker are used to allow selection for cells containing at least two types of vector. Also, where different types of plasmid are employed, the different plasmids can come from two distinct incompatibility groups to allow stable co-existence of two different plasmids within the cell. Nevertheless, plasmids from the same incompatibility group can still co-exist within the same cell for sufficient time to allow homologous recombination to occur.
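
The requirement that substrates differ in at least two positions can be checked with a small enumeration. The sketch below is illustrative only: it considers a single crossover between two aligned, equal-length sequences, and the function names and toy sequences are hypothetical.

```python
def single_crossover_products(a, b):
    """All sequences obtainable by one crossover between two aligned,
    equal-length substrates (both reciprocal products at each point)."""
    if len(a) != len(b):
        raise ValueError("substrates must be the same length")
    products = set()
    for point in range(1, len(a)):
        products.add(a[:point] + b[point:])
        products.add(b[:point] + a[point:])
    return products


def novel_products(a, b):
    """Crossover products that differ from both starting substrates."""
    return single_crossover_products(a, b) - {a, b}


one_difference = novel_products("ACGTACGT", "ACGTACGA")   # differ at one position
two_differences = novel_products("ACGTACGT", "TCGTACGA")  # differ at two positions

print(sorted(one_difference))
print(sorted(two_differences))
```

With a single differing position every crossover simply regenerates one of the two parents, whereas two differing positions allow products that neither parent contains.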

Plasmids containing diverse substrates are initially introduced into cells by any transfection methods (e.g., chemical transformation, natural competence, transduction, electroporation or biolistics). Often, the plasmids are present at or near saturating concentration (with respect to maximum transfection capacity) to increase the probability of more than one plasmid entering the same cell. The plasmids containing the various substrates can be transfected simultaneously or in multiple rounds. For example, in the latter approach cells can be transfected with a first aliquot of plasmid, transfectants selected and propagated, and then infected with a second aliquot of plasmid.

[0296] Having introduced the plasmids into cells, recombination between substrates to generate recombinant genes
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only one plasmtd are less ^^^^J^'^l^^^ . increased by allowing all substrates to participate in 

of variant genes cloned into a plasn^id. T^*^® '"''^^^ ^ u fZ cb\Is having taken up two plasmids. the plasmids 

to the selected phenotype, may lose the other plasmid, as shown in panel F.

(b) Virus-Plasmid Recombination
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K"SusT:— Plasmid and virus generating bo,, reccbined p.s..s and 

recombined viruses. For some viruses, such as filamentous phage, in which intracellular DNA exists in both double-stranded and single-stranded forms, both can participate in recombination. Provided that the virus is not one that rapidly kills cells, recombination can be augmented by use of electroporation or conjugation to transfer plasmids between cells. Recombination can also be augmented for some types of virus by allowing the progeny virus from one cell to reinfect other cells. For some types of virus, virus-infected cells show resistance to superinfection. However, such resistance can be overcome by infecting at high multiplicity and/or using mutant strains of the virus in which resistance

ro^rt-rsu^oflrcllngplasmld 

such as filamentous phage, stably exist with a plasmid in the cell and also extrude progeny phage from the cell. Other viruses, such as lambda having a cosmid genome, stably exist in a cell like plasmids without producing progeny virions. Other viruses, such as the T-phage and lytic lambda, undergo recombination with the plasmid but ultimately kill the host cell and destroy plasmid DNA. For viruses that infect cells without killing the host, cells containing recombinant plasmids and virus can be screened/selected using the same approach as for plasmid-plasmid recombination. Progeny virus extruded by cells surviving selection/screening can also be collected and used as substrates in subsequent rounds of recombination. For viruses that kill their host cells, recombinant genes resulting from recombination reside only in the progeny virus. If the screening or selective assay requires expression of recombinant genes in a cell, the recombinant genes should be transferred from the progeny virus to another vector, e.g., a plasmid vector, and retransfected into cells before selection/screening is performed.

[0304] For filamentous phage, the products of recombination are present in both cells surviving recombination and in phage extruded from these cells. The dual source of recombinant products provides some additional options relative to the plasmid-plasmid recombination. For example, DNA can be isolated from phage particles for use in a round of in vitro recombination. Alternatively, the progeny phage can be used to transfect or infect cells surviving a previous round of screening/selection, or fresh cells transfected with fresh substrates for recombination. In an aspect, the invention employs recombination between multiple single-stranded species, such as single-stranded bacteriophages and/or

[0305] Fig. 27 illustrates a scheme for virus-plasmid recombination. Panel A shows a library of variant forms of gene cloned into plasmid and viral vectors. The plasmids are then introduced into cells as shown in panel B. The viral genomes are packaged in vitro and used to infect the cells in panel B. The viral genomes can undergo replication within the cell, as shown in panel C. The viral genomes undergo recombination with plasmid genomes generating the plasmid and viral forms shown in panel D. Both plasmids and viral genomes can undergo further rounds of replication and recombination generating the structures shown in panels E and F. Screening/selection identifies cells containing plasmid and/or viral genomes having genes that have evolved best to allow survival of the cell in the screening/selection process, as shown in panel G. These viral genomes are also present in viruses extruded by such cells.



(c) Virus-Virus Recombination



[0306] The principles described for plasmid-plasmid and plasmid-viral recombination can be applied to virus-virus recombination with a few modifications. The initial substrates for recombination are cloned into a viral vector. Usually, the same vector is used for all substrates. Preferably, the virus is one that, naturally or as a result of mutation, does not kill cells. After insertion, viral genomes are usually packaged in vitro. The packaged viruses are used to infect cells at high multiplicity such that there is a high probability that a cell will receive multiple viruses bearing different substrates.

[0307] After the initial round of infection, subsequent steps depend on the nature of infection as discussed in the




For example, if the viruses have phagemid genomes such as lambda cosmids or M13, F1 or Fd phagemids, the phagemids behave as plasmids within the cell and undergo recombination simply by propagating the cells. Recombination is particularly efficient between single-stranded forms of intracellular DNA. Recombination can be augmented by electroporation of cells. Following selection/screening, cosmids containing recombinant genes can be recovered from surviving cells (e.g., by heat induction of a cos- lysogenic host cell), repackaged in vitro, and used to infect fresh cells at high multiplicity for a further round of recombination.

If the viruses are filamentous phage, recombination of replicating form DNA occurs by propagating the culture of infected cells. Selection/screening identifies colonies of cells containing viral vectors having recombinant genes with improved properties, together with phage extruded from such cells. Subsequent options are essentially the same as

In an example of virus-virus recombination, a library of diverse genes is cloned into a lambda cosmid. The recombinant cosmid DNA is packaged in vitro and used to infect host cells at high multiplicity such that many cosmids bearing different inserts enter the same cell. The cell chosen is a cos- lambda lysogen, which on induction
(d) Chromosome-Plasmid Recombination

[0310] This format can be used to evolve both the chromosomal and plasmid-borne substrates. The format


as shown in panel E. 

(e) Virus-Chromosome Recombination

[0314] As in the other methods described above, the virus is usually one that does not kill the cells and is often a phage or phagemid. The procedure is substantially the same as for plasmid-chromosome recombination. Substrates for recombination are cloned into the vector. Vectors including the substrates

substrates for recombination can be introduced into the cells, either on plasmid or viral vectors.
e. Evolution of Genes by Conjugative Transfer

[0316] As noted above, the rate of in vivo evolution of plasmid DNA can be accelerated by allowing transfer of
entirely for all purposes). Conjugation occurs between many types of gram negative bacteria, and some types of gram positive bacteria. Conjugative transfer is also known between bacteria and plant cells (Agrobacterium tumefaciens) or yeast. As discussed in copending application attorney docket no. 16528J^1 4612, the genes responsible for conjugative transfer can themselves be evolved to expand the range of cell types (e.g., from bacteria to mammals) between

r^TconrXTanXiseffectedbvanorig.^^ 

" es, termed'ra, encoding the structures and enzymes necessary .or co")"9a.ion 'o j^] '^^f ^ "^^^ 
defined as the site required in cis for DNA transfer. Tra genes include tra A, B, C. D, E F, G, H, ^ ^ ^ ^' ^1 
R S T U V W X Y;Z,virAB(alleles1-11),C,D,E,G,lHF,andFinOP.OriT,ssomet,mesalsodes,^^^^^^ 

gene, omer c^'lular eniymes. including those of the RecBCD pathway, RecA, SSB P-^i °'lre:BCD 

™gTnerj:xr==^:^ 

eLorlglnofrep.^^^^^^^^^ 

x z c^^y e olslLdrem^ 

f^omm^LmeincompatiblMy group aretransfected into thssame cell, the vectors trans-ently coexist for sufhcienttim 

a conjugative bridge through which DNA can transfer. This process activates a site-specific nuclease encoded by a MOB gene, which specifically cleaves DNA to be transferred at oriT. The cleaved DNA is then threaded through the

SrONtrsCCr^re eS^^ -en present as a component of the 'r^^^^^^^^ 

However, some mobilizable vectors integrate into the host chromosome and thereby mobilize adjacent genes from the chromosome. The F plasmid of E. coli, for example, integrates into the chromosome at high frequency and mobilizes genes unidirectionally from the site of integration. Other mobilizable vectors do not spontaneously integrate into a host chromosome at high efficiency but can be induced to do so by growth under particular conditions (e.g., treatment with a

"nragl'tlrrno^^^^ 

In the methods of recursive recombination, iterative cycles of recombination and selection can be used to evolve the gene(s) toward acquisition of a new or improved property. As in any method of recursive recombination, the first step is to generate a library of diverse forms of the gene or genes to be evolved. The diverse forms can be the result of natural diversity, the application of traditional mutagenesis methods (e.g., error-prone PCR or cassette mutagenesis) or the result of any of the other recombination formats, or any
combinationofthese.ThenumberofdiversefomnscanvarywidelytromaboutlOtolOO, 10«, lO^, 0 or 10 .Often 

of interest are mutagenized as discrete units. However, if the location of gene(s) is not known or a large number of genes are to be evolved simultaneously, initial diversity can be generated by in situ mutagenesis of a chromosome

The library of diverse forms of a gene or gene(s) is introduced into cells containing the apparatus necessary for conjugative transfer (assuming that the library is not already contained in such cells), usually in an arrangement such that the genes can be expressed. For example, if the gene(s) are mutagenized in the absence of essential regulatory sequences such as promoter, these sequences are reattached before introduction into cells. Similarly, if a fragment of a gene is mutagenized in isolation, the mutagenesis products are usually reassociated with unchanged flanking sequences before being introduced into cells. The apparatus necessary for conjugative transfer comprises a vector having an origin of transfer together with the mob and tra genes

transfer to occur. These genes can be included in the vector, in one or more different vectors, or in the chromosome. The library of diverse forms of the gene to be evolved is usually inserted into the vector containing the origin of transfer. However, in some situations the library of diverse forms of the gene can be present in the chromosome or a second vector, as well as, or instead of, in the vector containing the origin of transfer. The library of diverse forms
species is contemplated, the vector can contain two origins of replication, one functional in each cell type (i.e., a shuttle vector). Alternatively, if it is intended that transferred genes should integrate into

it is preferable that the vector not contain an origin of replication functional in the recipient cell type (i.e., a suicide vector). The oriT site and/or MOB genes can be introduced into a vector by cloning or transposing the RK2/RP4 MOB function (Guiney, J. Mol. Biol. 162: 699-703 (1982)), or by cointegrate formation with a MOB-containing plasmid. A convenient method for large plasmids is to use 'Tn5-Mob', which is the Tn5 transposon containing the oriT of RP4. For
example, pUC-like mobilizable vectors pK18 and pK19 (Schafer et al. (1995) Gene 145:69-73) are suitable starting vectors for cloning the tra gene library to be evolved.

[0324] Although not necessary, recombination is sometimes facilitated by inserting the diverse gene library into two different kinds of vectors having different incompatibility origins. Each vector should have an oriT and the cell should express MOB and tra functions suitable for mobilization of both vectors. Use of two such kinds of vectors allows stable coexistence of multiple vectors within the same cell and increases the efficiency of recombination between the vectors.
[0325] The collection of cells is propagated in any suitable media to allow gene expression to occur. Tra and mob genes are expressed and mediate transfer of the mobilizable vector between cells. If the diverse library is cloned into a mobilizable vector, its members are transferred as components of the vector. If the diverse library, or certain elements of the library, are in trans to the mobilizable vector, they are transferred only if the mobilizable vector integrates into or proximate to the elements. As discussed above, integration frequently occurs spontaneously for the E. coli F plasmid and can be induced for other mobilizable vectors.
[0326] As a result of transfer of members of the diverse library between cells, some of the cells come to contain more than one member of the diverse library. The multiple members undergo recombination within such cells generating still further diversity in a library of recombinant forms. In general, the longer cells are propagated the more recombinant forms are generated. Generally, recombination results in more than one recombination product within the same cell. If both recombination products are on vectors and the vectors are from the same incompatibility group, one of the vectors is lost as the cells are propagated. This process occurs faster if the cells are propagated on selective media in which one or other of the recombinant products confers a selective advantage. After a suitable period of recombination, which depends on the cell type and its growth cycle time, the recombinant forms are subject to screening or selection. Because the recombinant forms are already present in cells, this format for recombination is particularly amenable to alternation with cycles of in vivo screening or selection. The conditions for screening or selection, of course, depend on the property which it is desired that the gene(s) being evolved acquire or improve in. For example, if the property is drug resistance, recombinant forms having the best drug resistance can be selected by exposure to the drug. Alternatively, if a cluster of genes is being evolved to produce a drug as a secondary metabolite, cells bearing recombinant clusters of the genes can be screened by overlaying colonies of cells bearing recombinant clusters with a lawn of cells that are sensitive to the drug. Colonies having recombinant clusters resulting in production of the best drug are identified from holes in the lawn. If the gene being evolved confers enhanced growth characteristics, cells bearing the best genes can be selected by growth competition. Antibiotic production can be a growth rate advantage if cells are competing with other cell types for growth.

[0327] Screening/selection produces a subpopulation of cells expressing recombinant forms of gene(s) that have
evolved toward acquisition of a desired property. These recombinant forms can then be subjected to further rounds of
recombination and screening/selection in any order. For example, a second round of screening/selection can be
performed analogous to the first, resulting in greater enrichment for genes having evolved toward acquisition of the
desired property. Optionally, the stringency of selection can be increased between rounds (e.g., if selecting for drug
resistance, the concentration of drug in the media can be increased). Further rounds of recombination can also be
performed by an analogous strategy to the first round, generating further recombinant forms of the gene(s).
Alternatively, further rounds of recombination can be performed by any of the other molecular breeding formats
discussed. Eventually, a recombinant form of the gene(s) is generated that has fully acquired the desired property.
[0328] Fig. 30 provides an example of how a drug resistance gene can be evolved by conjugative transfer. Panel A
shows a library of diverse genes cloned into a mobilizable vector bearing an oriT. The vectors are present in cells
containing a second vector which provides tra functions. Conjugative transfer results in movement of the mobilizable
vectors between cells, such that different vectors bearing different variant forms of a gene occupy the same cell, as
shown in panel B. The different forms of the gene recombine to give the products shown in panel C. After conjugation
and recombination has proceeded for a desired time, cells are selected to identify those containing the recombined
genes, as shown in panel D.

[0329] In one aspect, the alternative shuffling method includes the use of intra-plasmidic recombination, wherein
libraries of sequence-recombined polynucleotide sequences are obtained by genetic recombination in vivo of direct
sequence repeats located on the same plasmid. In a variation, the sequences to be recombined are flanked by
site-specific recombination sequences and the polynucleotides are present in a site-specific recombination system, such
as an integron (Hall and Collins (1995) Mol. Microbiol. 15: 593, incorporated herein by reference).
[0330] In an aspect of the invention, mutator strains of host cells are used to enhance recombination of more highly
mismatched sequence-related polynucleotides. Bacterial strains such as MutL, MutS, or MutH, or other cells
expressing the Mut proteins (XL-1red; Stratagene, San Diego, CA) can be used as host cells for shuffling of
sequence-related polynucleotides by in vivo recombination. Other mutation-prone host cell types can also be used, such
as those having a proofreading-defective polymerase (Foster et al. (1995) Proc. Natl. Acad. Sci. ..., incorporated herein
by reference). Other in vivo mutagenic formats can be employed, including administering chemical or radiological
mutagens to host cells. Examples of such mutagens include but are not limited to: ENU, MMNG, nitrosourea, BuDR, and
the like.

[0331] Shuffling can be used to evolve polymerases capable of incorporation of base analogs in PCR or PCR-like
amplification reactions. A DNA polymerase which is evolved to use base analogs can be used to copy DNA by PCR
into a chemical form which gives more resolvable fragmentation patterns in mass spectrometry, such as for mass
spectrometry DNA sequencing. The base analogs can have fewer and/or more favorable fragmentation sites to
enhance or facilitate the interpretation of the mass spectrum patterns.

[0332] Variant polymerases can also be evolved by recursive sequence recombination to incorporate non-natural
nucleotides or nucleotide analogs, such as phosphorothioate nucleotides. Phosphorothioate nucleotides made with
such variant polymerases can provide many uses, including naked DNA gene therapy vectors which are resistant to
nuclease degradation. Other examples of properties of polymerases which can be modified via recursive sequence
recombination include, but are not limited to, processivity, error rate, proofreading, thermal stability, oxidation
resistance, nucleotide preferences, template specificity, and the like.

[0333] In an embodiment, fluorescence-activated cell sorting or analogous methodology is used to screen for host
cells, typically mammalian cells, insect cells, or bacterial cells, comprising a library member of a recursively recombined
sequence library, wherein the host cell having a library member conferring a desired phenotype can be selected on
the basis of fluorescence or optical density at one or more detection wavelengths. In one embodiment, for example,
each library member typically encodes an enzyme, which may be secreted from the cell or may be intracellular, and
the enzyme catalyzes conversion of a chromogenic or fluorogenic substrate, which may be capable of diffusing into
the host cell (e.g., if said enzyme is not secreted). Host cells containing library members are contained in fluid drops
or gel drops and passed by a detection apparatus where the drops are illuminated with an excitation wavelength and
a detector measures either fluorescent emission wavelength radiation and/or measures optical density (absorption) at
one or more excitatory wavelength(s). The cells suspended in drops are passed across a sample detector under
conditions wherein only about one individual cell is present in a sample detection zone at a time. A source illuminates
each cell and a detector, typically a photomultiplier or photodiode, detects emitted radiation. The detector controls
gating of the cell in the detection zone into one of a plurality of sample collection regions on basis of the signal(s)
detected. A general description of FACS apparatus and methods is provided in U.S. Patents 4,172,227; 4,347,935;
4,661,913; 4,667,830; 5,093,234; 5,094,940; and 5,144,224, incorporated herein by reference. A suitable alternative
to conventional FACS is available from One Cell Systems, Inc., Cambridge, MA.

[0334] As can be appreciated from the disclosure above, the present invention has a wide variety of applications. 
Accordingly, the following examples are offered by way of illustration, not by way of limitation. 

EXPERIMENTAL EXAMPLES 

[0335] In the examples below, the following abbreviations have the following meanings. If not defined below, then 
the abbreviations have their art recognized meanings. 

ml = milliliter
µl = microliters
µM = micromolar
nM = nanomolar
PBS = phosphate buffered saline
ng = nanograms
µg = micrograms
IPTG = isopropylthio-β-D-galactoside
bp = basepairs
kb = kilobasepairs
dNTP = deoxynucleoside triphosphates
PCR = polymerase chain reaction
X-gal = 5-bromo-4-chloro-3-indolyl-β-D-galactoside
DNAseI = deoxyribonuclease
CDR = complementarity determining regions
MIC = minimum inhibitory concentration
scFv = single-chain Fv fragment of an antibody

[0336] In general, standard techniques of recombination DNA technology are described in various publications, e.g.,
Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory; Ausubel et al.,
1987, Current Protocols in Molecular Biology, vols. 1 and 2 and supplements; and Berger and Kimmel, Methods in
Enzymology, Volume 152, Guide to Molecular Cloning Techniques (1987), Academic Press, Inc., San Diego, CA, each
of which is incorporated herein in their entirety by reference. Restriction enzymes and polynucleotide modifying
enzymes were used according to the manufacturers' recommendations. Oligonucleotides were synthesized on an Applied
Biosystems Inc. Model 394 DNA synthesizer using ABI chemicals. If desired, PCR amplimers for amplifying a
predetermined DNA sequence may be selected at the discretion of the practitioner.



EXAMPLES 

Example 1. LacZ alpha gene reassembly

1) Substrate preparation

[0337] The substrate for the reassembly reaction was the dsDNA polymerase chain reaction ("PCR") product of the
wild-type LacZ alpha gene from pUC18 (Fig. 2) (28; Gene Bank No. X02514). The primer sequences were
5'AAAGCGTGGATTTTTGTGAT3' (SEQ ID NO:1) and 5'ATGGGGTTCCGCGCACATTT3' (SEQ ID NO:2). The free
primers were removed from the PCR product by Wizard PCR prep (Promega, Madison WI) according to the
manufacturer's directions. The removal of the free primers was found to be important.

2) DNAseI digestion

[0338] About 5 µg of the DNA substrate was digested with 0.15 units of DNAseI (Sigma, St. Louis MO) in 100 µl of
[50 mM Tris-HCl pH 7.4, 1 mM MgCl2], for 10-20 minutes at room temperature. The digested DNA was run on a 2%
low melting point agarose gel. Fragments of 10-70 basepairs (bp) were purified from the 2% low melting point agarose
gels by electrophoresis onto DE81 ion exchange paper (Whatman, Hillsborough OR). The DNA fragments were eluted
from the paper with 1 M NaCl and ethanol precipitated.
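
The digestion and size selection of this step can be pictured with a toy simulation. The following is only an illustrative sketch and not part of the described procedure: it treats the substrate as a single strand, cuts each bond independently with a fixed probability, and uses a random 1,000-base stand-in sequence. The function names `digest` and `size_select` and the mean fragment length of 40 are assumptions made for illustration.

```python
import random


def digest(seq, mean_fragment_len, rng):
    """Cut seq at random bonds; each bond is cut with probability 1/mean_fragment_len.

    Simplified single-strand model (an assumption); real DNAseI digestion of
    dsDNA is more complex.
    """
    fragments, start = [], 0
    for i in range(1, len(seq)):
        if rng.random() < 1.0 / mean_fragment_len:
            fragments.append(seq[start:i])
            start = i
    fragments.append(seq[start:])
    return fragments


def size_select(fragments, min_len, max_len):
    """Keep only fragments inside the gel-purified size window (inclusive)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
template = "".join(rng.choice("ACGT") for _ in range(1000))  # stand-in for the 1.0 kb substrate
fragments = digest(template, mean_fragment_len=40, rng=rng)
selected = size_select(fragments, 10, 70)  # the 10-70 bp window of this step
print(len(fragments), "fragments,", len(selected), "within 10-70 bases")
```

In this model the fragments outside the window are simply discarded, which mirrors the gel purification step only in outline.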

3) DNA Reassembly 

[0339] The purified fragments were resuspended at a concentration of 10 - 30 ng/µl in PCR Mix (0.2 mM each dNTP,
2.2 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, 0.3 µl Taq DNA polymerase, 50 µl total volume).
No primers were added at this point. A reassembly program of 94°C for 60 seconds, 30-45 cycles of [94°C for 30
seconds, 50-55°C for 30 seconds, 72°C for 30 seconds] and 5 minutes at 72°C was used in an MJ Research (Watertown
MA) PTC-150 thermocycler. The PCR reassembly of small fragments into larger sequences was followed by taking
samples of the reaction after 25, 30, 35, 40 and 45 cycles of reassembly (Fig. 2).
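
As a rough aid to planning, the programmed hold time of the reassembly program above can be added up as follows. This is a minimal sketch; the helper name `program_seconds` is hypothetical, and the figure excludes the temperature ramp times of the thermocycler, which are not stated here.

```python
def program_seconds(initial_s, cycle_steps_s, n_cycles, final_s):
    """Sum of programmed hold times in seconds (ramp times are not included)."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s


# 94°C for 60 s, then N cycles of three 30 s steps, then 5 minutes at 72°C
for n_cycles in (30, 45):
    total = program_seconds(60, (30, 30, 30), n_cycles, 300)
    print(n_cycles, "cycles:", total / 60, "minutes of hold time")
```

The actual run time is longer than this sum by the ramping time between temperatures.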
[0340] Whereas the reassembly of 100-200 bp fragments can yield a single PCR product of the correct size, 10-50
base fragments typically yield some product of the correct size, as well as products of heterogeneous molecular weights.
Most of this size heterogeneity appears to be due to single-stranded sequences at the ends of the products, since after
restriction enzyme digestion a single band of the correct size is obtained.

4) PCR with primers

[0341] After dilution of the reassembly product into the PCR Mix with 0.8 µM of each of the above primers (SEQ ID
NOs: 1 and 2) and about 15 cycles of PCR, each cycle consisting of [94°C for 30 seconds, 50°C for 30 seconds and
72°C for 30 seconds], a single product of the correct size was obtained (Fig. 2).

5) Cloning and analysis

[0342] The PCR product from step 4 above was digested with the terminal restriction enzymes BamHI and Eco0109
and gel purified as described above in step 2. The reassembled fragments were ligated into pUC18 digested with
BamHI and Eco0109. E. coli were transformed with the ligation mixture under standard conditions as recommended
by the manufacturer (Stratagene, San Diego CA) and plated on agar plates having 100 µg/ml ampicillin, 0.004% X-gal
and 2 mM IPTG. The resulting colonies having the HindIII-NheI fragment which is diagnostic for the ++ recombinant
were identified because they appeared blue.

[0343] This Example illustrates that a 1.0 kb sequence carrying the LacZ alpha gene can be digested into 10-70 bp
fragments, and that these gel purified 10-70 bp fragments can be reassembled to a single product of the correct size,
such that 84% (N=377) of the resulting colonies are LacZ+ (versus 94% without shuffling; Fig. 2).

[0344] The DNA encoding the LacZ gene from the resulting LacZ- colonies was sequenced with a sequencing kit
(United States Biochemical Co., Cleveland OH) according to the manufacturer's instructions and the genes were found
to have point mutations due to the reassembly process (Table 1). 11/12 types of substitutions were found, and no
frameshifts.

TABLE 1

Mutations introduced by mutagenic shuffling

Transitions   Frequency   Transversions   Frequency
G-A           6           A-T             1
A-G           4           A-C             2
C-T           7           C-A             1
T-C           3           C-G             0
                          G-C             3
                          G-T             2
                          T-A             1
                          T-G             2

[0345] A total of 4,437 bases of shuffled lacZ DNA were sequenced.

[0346] The rate of point mutagenesis during DNA reassembly from 10-70 bp pieces was determined from DNA
sequencing to be 0.7 % (N=4,473), which is similar to error-prone PCR. Without being limited to any theory it is believed
that the rate of point mutagenesis may be lower if larger fragments are used for the reassembly, or if a proofreading
polymerase is added.

[0347] When plasmid DNA from 14 of these point-mutated LacZ- colonies were combined and again reassembled/
shuffled by the method described above, 34% (N=291) of the resulting colonies were LacZ+, and these colonies
presumably arose by recombination of the DNA from different colonies.

[0348] The expected rate of reversal of a single point mutation by error-prone PCR, assuming a mutagenesis rate
of 0.7% (10), would be <1%.
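
The figures quoted in this Example can be cross-checked from the substitution counts of Table 1. The sketch below is illustrative only. It uses the 4,437 sequenced bases stated above, and for the reversal estimate it assumes, purely for illustration, that a mutated base is equally likely to change to any of the three other bases; that assumption is not taken from the text.

```python
# Substitution counts as listed in Table 1
transitions = {"G-A": 6, "A-G": 4, "C-T": 7, "T-C": 3}
transversions = {
    "A-T": 1, "A-C": 2, "C-A": 1, "C-G": 0,
    "G-C": 3, "G-T": 2, "T-A": 1, "T-G": 2,
}
counts = {**transitions, **transversions}

total_mutations = sum(counts.values())
types_observed = sum(1 for n in counts.values() if n > 0)
bases_sequenced = 4437  # total bases of shuffled lacZ DNA sequenced

rate = total_mutations / bases_sequenced
print(f"{types_observed}/{len(counts)} substitution types observed")
print(f"point mutation rate: {rate:.2%}")

# Illustrative assumption: a hit at the mutated position restores the original
# base only one time in three, so the per-site reversal chance is about rate / 3.
reversal = rate / 3
print(f"estimated chance of reversing one given point mutation: {reversal:.2%}")
```

Under these assumptions the reversal estimate stays well below 1%, so the 34% LacZ+ colonies obtained after shuffling the 14 mutants are far more than point reversion alone would be expected to give.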

[0349] Thus large DNA sequences can be reassembled from a random mixture of small fragments by a reaction that 
is surprisingly efficient and simple. One application of this technique is the recombination or shuffling of related se- 
quences based on homology. 

Example 2. LacZ gene and whole plasmid DNA shuffling 

1) LacZ gene shuffling 

[0350] Crossover between two markers separated by 75 bases was measured using two LacZ gene constructs. Stop
codons were inserted in two separate areas of the LacZ alpha gene to serve as negative markers. Each marker is a
25 bp non-homologous sequence with four stop codons, of which two are in the LacZ gene reading frame. The 25 bp
non-homologous sequence is indicated in Figure 3 by a large box. The stop codons are either boxed or underlined. A
1:1 mixture of the two 1.0 kb LacZ templates containing the +- and -+ versions of the LacZ alpha gene (Fig. 3) was
digested with DNAseI and 100-200 bp fragments were purified as described in Example 1. The shuffling program was
conducted under conditions similar to those described for reassembly in Example 1 except 0.5 µl of polymerase was
added and the total volume was 100 µl.

[0351] After cloning, the number of blue colonies obtained was 24% (N=386), which is close to the theoretical
maximum number of blue colonies (i.e. 25%), indicating that recombination between the two markers was complete. All of
the 10 blue colonies contained the expected HindIII-NheI restriction fragment.
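
The 25% theoretical maximum can be reproduced by enumerating the marker combinations. This is a minimal sketch under the assumption that, with complete recombination of a 1:1 mixture, each of the two marker positions is inherited independently and with equal probability from either parent; only the ++ combination lacks both stop-codon markers and gives a blue colony.

```python
from fractions import Fraction
from itertools import product

# Each marker position is "+" (wild type) or "-" (stop codon marker).
# With complete recombination of a 1:1 mixture of +- and -+ templates,
# all four combinations are assumed to be equally likely.
genotypes = list(product("+-", repeat=2))

blue = [g for g in genotypes if g == ("+", "+")]
p_blue = Fraction(len(blue), len(genotypes))
print("expected fraction of blue (++) colonies:", p_blue)
```

The same enumeration shows that the -- class is also expected at one quarter, with the remaining half being the two parental types.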

2) Whole plasmid DNA shuffling

[0352] Whole 2.7 kb plasmids (pUC18 -+ and pUC18 +-) were also tested. A 1:1 mixture of the two 2.9 kb plasmids
containing the +- and -+ versions of the LacZ alpha gene (Fig. 3) was digested with DNAseI and 100-200 bp fragments
were purified as described in Example 1. The shuffling program was conducted under conditions similar to those
described for reassembly in step (1) above except the program was for 60 cycles [94°C for 30 seconds, 55°C for 30
seconds, 72°C for 30 seconds]. Gel analysis showed that after the shuffling program most of the product was greater
than 20 kb. Thus, whole 2.7 kb plasmids (pUC18 -+ and pUC18 +-) were efficiently reassembled from random 100-200
bp fragments without added primers.

[0353] After digestion with a restriction enzyme having a unique site on the plasmid (Eco0109), most of the product
consisted of a single band of the expected size. This band was gel purified, religated and the DNA used to transform
E. coli. The transformants were plated on 0.004% X-gal plates as described in Example 1. [...]
plasmids were blue and thus ++ recombinants.

3) Spiked DNA Shuffling 

[0354] Oligonucleotides that are mixed into the shuffling mixture can be incorporated into the final product based on
the homology of the flanking sequences of the oligonucleotide to the template DNA (Fig. 4). The [...] pUC18 [...]
described above was used as the DNAseI digested template. A 66 mer oligonucleotide, including [...] LacZ gene at
both ends, was added into the reaction at a 4-fold molar excess [...] stop codon mutations present in the original gene.
The shuffling reaction was conducted under conditions similar to those in step 2 above. The resulting product was
digested, ligated and inserted [...]


Table 2

                              % blue colonies
Control                       0.0 (N>1000)
Top strand spike              8.0 (N=855)
Bottom strand spike           9.3 (N=620)
Top and bottom strand spike   2.1 (N=537)



[0355] ssDNA appeared to be more efficient than dsDNA, presumably due to competitive hybridization. The degree
of incorporation can be varied over a wide range by adjusting the molar excess, annealing temperature, or the length
of homology.

Example 3. DNA reassembly in the complete absence of primers

[0356] Plasmid pUC18 was digested with restriction enzymes EcoRI, Eco0109, XmnI and [...], yielding fragments
of approximately 370, 460, 770 and 1080 bp. These fragments were electrophoresed and separately purified from a
2% low melting point agarose gel (the 370 and 460 basepair bands could not be separated), yielding a large fragment,
a medium fragment and a mixture of two small fragments in 3 separate tubes.

[0357] Each fragment was digested with DNAseI as described in Example 1 and fragments of 50-130 bp were
purified from a 2% low melting point agarose gel for each of the original fragments.

[0358] PCR mix (as described in Example 1 above) was [...] concentration of 10 ng/µl of fragments. No primers
were added. A reassembly reaction was performed for 75 cycles [94°C for [...]

[0359] [...] 770 and the 370 and 460 bp bands reformed efficiently from the
purified fragments, demonstrating that shuffling does not require the use of any primers at all.

Example 4. IL-1β gene shuffling

[0360] This example illustrates that crossovers based on homologies of less than 15 bases may be obtained. As an
example, a human and a murine IL-1β gene were shuffled. [...] Systems Inc. [...]

[0363] The first 15 cycles of the shuffling reaction were performed with the Klenow fragment of DNA polymerase I
[...] was added to the PCR mix of Example 1 which mix lacked the [...]
5'[...]CGCATGC[...]CTTGGATCCTTATT3' (SEQ ID NO:5) and 5'AAAGCCCTCTAGATGATTACGAATTCATAT3'
[...] PCR reaction was performed as described above in Example 1. The second primer pair differed
from the first pair only because a change in restriction sites was deemed necessary.

[0365] After digestion of the PCR product with XbaI and SphI, the fragments were ligated into XbaI-SphI-digested
[...] The sequences of the inserts from several colonies were [...] (United
States Biochemical Co., Cleveland OH) according to the manufacturer's instructions.

[0366] A total of 17 crossovers were found by DNA sequencing of nine colonies. Some of the crossovers were based
[...]

[0367] [...] based on short homologies, a very low effective [...] is required. With any heat-stable polymerase, the
cooling time of the PCR machine (94°C to 25°C at 1-2 [...]) [...]
none of the protocols based on Taq polymerase yielded crossovers, even when a [...] of one of the [...]
genes was used. In contrast, a heat-labile polymerase, such as the Klenow fragment of DNA polymerase I, can be
used to accurately obtain a low annealing temperature.

Example 5. DNA shuffling of the TEM-1 betalactamase gene

[0369] The utility of mutagenic DNA shuffling for directed molecular evolution was tested in a betalactamase model
system. TEM-1 betalactamase is a very efficient enzyme, limited in its reaction rate primarily by diffusion. This example
determines whether it is possible to change its reaction specificity and obtain resistance to the drug cefotaxime that it
[...]

[0370] [...] by plating 10 [...] a 10^[...] dilution of an overnight bacterial culture (about 1000 cfu) of E. coli XL1-blue
cells (Stratagene, San Diego CA) on plates with varying levels of cefotaxime (Sigma, St. Louis MO), followed by
incubation for 24 hours [...]

[0371] Growth on cefotaxime is sensitive to the density of cells, and therefore similar numbers of cells needed to be
plated on each plate (obtained by plating on plain LB plates). Platings of 1000 cells were consistently performed.

1) Initial Plasmid Construction

[0372] A pUC18 derivative carrying the bacterial TEM-1 betalactamase gene was used (28). The TEM-1 betalactamase
gene confers resistance to bacteria against approximately 0.02 µg/ml of cefotaxime. SfiI restriction sites were
added [...]
PCR of the betalactamase gene sequence with two other primers:

Primer C (SEQ ID NO:9): 5'AACTGACCACGGCCTGACAGGCCGGTCTGACAG[...], and

Primer D (SEQ ID NO:10): 5'AACCTGTCCTGGCCACCATGGCCTAAATACATTCAAATATG[...]

[0373] The two reaction products were digested with SfiI, mixed, ligated and used to transform bacteria.

[0374] The resulting plasmid was pUC182Sfi. This plasmid contains an SfiI fragment carrying the TEM-1 gene and
[...]

[0375] The minimum inhibitory concentration of cefotaxime for E. [...]

[0376] [...] stepwise replating of a diluted pool of cells (approximately 10^7 cfu) on 2-fold increasing drug levels.
Resistance up to 1.28 µg/ml could be obtained without shuffling. This represented a 64 fold increase in resistance.

2) DNAseI digestion

[0377] The substrate for the first shuffling reaction was dsDNA of 0.9 kb obtained by PCR of pUC182Sfi with primers
[...]

[0378] [...]

[0379] About 5 µg of the DNA substrate(s) was digested with 0.15 units of DNAseI (Sigma, St. Louis MO) in 100 µl
of 50 mM Tris-HCl pH 7.4, 1 mM MgCl2, for 10 min at room temperature. Fragments of 100-300 bp were purified from
2% low melting point agarose gels by electrophoresis onto DE81 ion exchange paper (Whatman, Hillsborough OR),
elution with 1 M NaCl and ethanol precipitation by the method described in Example 1.



3) Gene shuffling 



[0380] The purified fragments were resuspended in PCR mix (0.2 mM each dNTP, 2.2 mM MgCl2, 50 mM KCl, 10
mM Tris-HCl pH 9.0, 0.1% Triton X-100), at a concentration of 10 - 30 ng/µl. No primers were added at this point. A
reassembly program of 94°C for 60 seconds, then 40 cycles of [94°C for 30 seconds, 50-55°C for 30 seconds, 72°C
for 30 seconds] and then 72°C for 5 minutes was used in an MJ Research (Watertown MA) PTC-150 thermocycler.

4) Amplification of Reassembly Product with primers

[0381] After dilution of the reassembly product into the PCR mix with 0.8 µM of each primer (C and D) and 20 PCR
cycles [94°C for 30 seconds, 50°C for 30 seconds, 72°C for 30 seconds] a single product 900 bp in size was obtained.



5) Cloning and analysis 



[0382] After digestion of the 900 bp product with the terminal restriction enzyme SfiI and [...],
the 900 bp product was ligated into the vector pUC182Sfi at the unique SfiI site with T4 DNA ligase (BRL, [...]
MD). The mixture was electroporated into E. coli XL1-blue cells and plated on LB plates with 0.32-0.64 µg/ml of
cefotaxime (Sigma, St. Louis MO). The cells were grown for up to 24 hours at 37°C and the resulting colonies were scraped
off the plate as a pool and used as the PCR template for the next round of shuffling.

6) Subsequent Reassembly Rounds

[0383] The transformants obtained after each of three rounds of shuffling were plated on increasing levels of
cefotaxime. The colonies (>100, to maintain diversity) from the plate with the highest level of cefotaxime were pooled and
used as the template for the PCR reaction for the next round.

[0384] A mixture of the cefotaxime-resistant colonies obtained at 0.32-0.64 µg/ml in Step (5) above were used as the
template for the next round of shuffling. 10 µl of cells in LB broth were used as the template in a reassembly program
of 10 minutes at 99°C, then 35 cycles of [94°C for 30 seconds, 52°C for 30 seconds, 72°C for 30 seconds] [...]

[0385] The reassembly products were digested and ligated into pUC182Sfi as described in step (5) above. The
mixture was electroporated into E. coli XL1-blue cells and plated on LB plates having 5-10 µg/ml of cefotaxime.

[0386] Colonies obtained at 5-10 µg/ml were used for a third round similar to the first and second rounds except the
cells were plated on LB plates having 80-160 µg/ml of cefotaxime. After the third round, colonies were obtained at
80-160 µg/ml and after replating on increasing concentrations of cefotaxime, colonies could be obtained at up to 320
µg/ml [...]

[0387] [...] the cell density, requiring that all the MICs be standardized (in our
case to about 1,000 cells per plate). At higher cell densities, growth at up to 1,280 µg/ml was obtained. The 5 largest
colonies grown at 1,280 µg/ml were plated for single colonies twice, and the SfiI inserts were analyzed by restriction
[...]

[...] After selection, the plasmid of selected clones was transferred back into wild-type E. coli XL1-blue cells
(Stratagene, San Diego CA) to ensure that none of the measured drug resistance was due to chromosomal mutations.

[0390] Three cycles of shuffling and selection yielded a 1.6x10^4-fold increase in the minimum inhibitory concentration
of the extended broad spectrum antibiotic cefotaxime for the TEM-1 betalactamase. In contrast, repeated plating without
shuffling resulted in only a 16-fold increase in resistance (error-prone PCR or cassette mutagenesis).
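
The fold increases reported for the successive rounds follow from the cefotaxime concentrations given above. The sketch below is illustrative only: it takes the lower bound of each plating window as the reference level for that round, which is an assumption made for the arithmetic, and it works in ng/ml so that the divisions are exact.

```python
import math

baseline_ng_per_ml = 20  # approximately 0.02 µg/ml for the starting TEM-1 gene

# Lower bound of each plating window (assumed reference level), in ng/ml
levels_ng_per_ml = {
    "round 1 plating (0.32-0.64 µg/ml)": 320,
    "round 2 plating (5-10 µg/ml)": 5_000,
    "round 3 plating (80-160 µg/ml)": 80_000,
    "after round 3 replating (320 µg/ml)": 320_000,
}

for label, level in levels_ng_per_ml.items():
    fold = level / baseline_ng_per_ml
    doublings = math.log2(fold)
    print(f"{label}: {fold:g}-fold, about {doublings:.1f} two-fold steps")

final_fold = levels_ng_per_ml["after round 3 replating (320 µg/ml)"] / baseline_ng_per_ml
```

On these figures the final level corresponds to the stated 1.6x10^4-fold increase, or roughly fourteen two-fold steps above the starting level.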

7) Sequence analysis 

[0391] All 5 of the largest colonies grown at 1,280 µg/ml had a restriction map identical to the wild-type TEM-1
enzyme. The SfiI insert of the plasmid obtained from one of these colonies was sequenced by dideoxy DNA sequencing
kit (United States Biochemical Co., Cleveland OH) according to the manufacturer's instructions. All the base numbers
correspond to the revised pBR322 sequence (29), and the amino acid numbers correspond to the ABL [...]
numbering scheme (30). The amino acids are designated by their three letter codes and the nucleotides by their one letter
codes. The term G4205A means that nucleotide 4205 was changed from guanine to adenine.

[0392] Nine single base substitutions were found. G4205A is located between the -35 and -10 sites of the
betalactamase P3 promoter (31). The promoter up-mutant observed by Chen and Clowes (31) is located outside of the SfiI
fragment used here, and thus could not have been detected. Four mutations were silent (A3689G, G3713A, G3934A
and T3959A), and four resulted in an amino acid change (C3448T resulting in Gly238Ser, A3615G resulting in
Met182Thr, C3850T resulting in Glu104Lys, and G4107A resulting in Ala18Val).

8) Molecular Backcross 

[0393] Molecular backcrossing with an excess of the wild-type DNA was then used in order to eliminate non-essential
mutations.

[0394] Molecular backcrossing was conducted on a selected plasmid from the third round of DNA shuffling by the
method identical to normal shuffling as described above, except that the DNAseI digestion and shuffling reaction were
performed in the presence of a 40-fold excess of wild-type TEM-1 gene fragment. To make the backcross more efficient,
very small DNA fragments (30 to 100 bp) were used in the shuffling reaction. The backcrossed mutants were again
selected on LB plates with 80-160 µg/ml of cefotaxime (Sigma, St. Louis MO).
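
The effect of the wild-type excess on mutations that are not under selection can be estimated with a simple dilution model. The following sketch rests on assumptions that are not stated in the text: each mutated position is taken to derive from the mutant parent with probability 1/(1 + excess) per backcross, independently of the other positions, and selection is taken not to act on the mutation in question. The function name `retention_probability` is hypothetical.

```python
def retention_probability(wild_type_excess, rounds=1):
    """Chance that one unselected mutation survives the given number of backcrosses.

    Assumes (for illustration) independent inheritance of each position and a
    mutant:wild-type fragment ratio of 1:wild_type_excess.
    """
    return (1.0 / (1.0 + wild_type_excess)) ** rounds


p_one_round = retention_probability(40)
p_two_rounds = retention_probability(40, rounds=2)
p_all_four_lost = (1.0 - p_one_round) ** 4  # four silent mutations, one backcross

print(f"retained after one backcross:  {p_one_round:.2%}")
print(f"retained after two backcrosses: {p_two_rounds:.3%}")
print(f"all four silent mutations lost after one backcross: {p_all_four_lost:.1%}")
```

Mutations that are required for resistance are not expected to follow this dilution, since the selection on cefotaxime recovers the clones that have kept them.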

[0395] This backcross shuffling was repeated with DNA from colonies from the first backcross round in the presence
of a 40-fold excess of wild-type TEM-1 DNA. Small DNA fragments (30-100 bp) were used to increase the efficiency
of the backcross. The second round of backcrossed mutants were again selected on LB plates with 80-160 µg/ml of
cefotaxime.

[0396] The resulting transformants were plated on 160 µg/ml of cefotaxime, and a pool of colonies was replated on
increasing levels of cefotaxime up to 1,280 µg/ml. The largest colony obtained at 1,280 µg/ml was replated for single
colonies.

[0397] This backcrossed mutant was 32,000 fold more resistant than wild-type (MIC=640 µg/ml). The mutant strain
is 64-fold more resistant to cefotaxime than previously reported clinical or engineered TEM-1-derived strains. Thus, it
appears that DNA shuffling is a fast and powerful tool for at least several cycles of directed molecular evolution.

[0398] The DNA sequence of the SfiI insert of the backcrossed mutant was determined using a dideoxy DNA
sequencing kit (United States Biochemical Co., Cleveland OH) according to the manufacturer's instructions (Table 3).
The mutant had 9 single base pair mutations. As expected, all four of the previously identified silent mutations were
lost, reverting to the sequence of the wild-type gene. The promoter mutation (G4205A) as well as three of the four
amino acid mutations (Glu104Lys, Met182Thr, and Gly238Ser) remained in the backcrossed clone, suggesting that they
are essential for high level cefotaxime resistance. However, two new silent mutations (T3842C and A3767G), as well
as three new mutations resulting in amino acid changes were found (C3441T resulting in Arg241His, C3886T resulting
in Gly92Ser and G4035C resulting in Ala42Gly). While these two silent mutations do not affect the protein primary
sequence, they may influence protein expression level (for example by mRNA structure) and possibly even protein
folding (by changing the codon usage and therefore the pause site, which has been implicated in protein folding).

Table 3

Mutations in Betalactamase

Mutation Type       Non-Backcrossed   Backcrossed
amino acid change   Ala18Lys
                    Glu104Lys         Glu104Lys
                    Met182Thr         Met182Thr
                    Gly238Ser         Gly238Ser
                                      Ala42Gly
                                      Gly92Ser
silent              T3959A
                    G3934A
                    G3713A
                    A3689G
                                      T3842C
                                      A3767G
promoter            G4205A            G4205A



[0399] Both the backcrossed and the non-backcrossed mutants have a promoter mutation (which by itself or in
combination results in a 2-3 fold increase in expression level) as well as three common amino acid changes (Glu104Lys,
Met182Thr and Gly238Ser). Glu104Lys and Gly238Ser are mutations that are present in several cefotaxime resistant
or other TEM-1 derivatives (Table 4).

9) Expression Level Comparison

[0400] The expression level of the betalactamase gene in the wild-type plasmid, the non-backcrossed mutant and
in the backcrossed mutant was compared by SDS-polyacrylamide gel electrophoresis (4-20%; Novex, San Diego CA)
of periplasmic extracts prepared by osmotic shock according to the method of Witholt, B. (32).

[0401] Purified TEM-1 betalactamase (Sigma, St. Louis MO) was used as a molecular weight standard, and E. coli
XL1-blue cells lacking a plasmid were used as a negative control.

[0402] The mutant and the backcrossed mutant appeared to produce a 2-3 fold higher level of the betalactamase
protein compared to the wild-type gene. The promoter mutation appeared to result in a 2-3 times increase in
betalactamase.

16 

Fvampia 6. Construction of mutant combinations of the TEM-I betalactamase gene 

[0403] To determine the resistance of different combinations of mutations and to compare the new mutants to
published mutants, several mutants were constructed into an identical plasmid background. Two of the mutations,
Glu104Lys and Gly238Ser, are known as cefotaxime mutants. All mutant combinations constructed had the promoter
mutation, to allow comparison to selected mutants. The results are shown in Table 4.

[0404] Specific combinations of mutations were introduced into the wild-type pUC182Sfi by PCR, using two
oligonucleotides per mutation.

[0405] The oligonucleotides to obtain the following mutations were:


Ala42Gly:
(SEQ ID NO: 11) AGTTGGGTGGACGAGTGGGTTACATCGAACT and
(SEQ ID NO: 12) AACCCACTCGTCCACCCAACTGATCTTCAGCAT;

Gln39Lys:
(SEQ ID NO: 13) AGTAAAAGATGCTGAAGATAAGTTGGGTGCACGAGTGGGTT and
(SEQ ID NO: 14) ACTTATCTTCAGCATCTTTTACTT;

Gly92Ser:
(SEQ ID NO: 15) AAGAGCAACTCAGTCGCCGCATACACTATTCT and
(SEQ ID NO: 16) ATGGCGGCGACTGAGTTGCTCTTGCCCGGCGTCAAT;

Glu104Lys:
(SEQ ID NO: 17) TATTCTCAGAATGACTTGGTTAAGTACTCACCAGTCACAGAA and
(SEQ ID NO: 18) TTAACCAAGTCATTCTGAGAAT;

Met182Thr:
(SEQ ID NO: 19) AACGACGAGCGTGACACCACGACGCCTGTAGCAATG and
(SEQ ID NO: 20) TCGTGGTGTCACGCTCGTCGTT;

Gly238Ser alone:
(SEQ ID NO: 21) TTGCTGATAAATCTGGAGCCAGTGAGCGTGGGTCTCGCGGTA and
(SEQ ID NO: 22) TGGCTCCAGATTTATCAGCAA;

Gly238Ser and Arg241His (combined):
(SEQ ID NO: 23) ATGCTCACTGGCTCCAGATTTATCAGCAAT and
(SEQ ID NO: 24) TCTGGAGCCAGTGAGCATGGGTCTCGCGGTATCATT;

G4205A:
(SEQ ID NO: 25) AACCTGTCCTGGCCACCATGGCCTAAATACAATCAAATATGTATCCGCTTATGAGACAATAACCCTGATA.
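
The two oligonucleotides listed for each mutation appear to be designed so that the shorter one is complementary to the 5' end of the longer one, which is what lets the separate PCR fragments overlap during reassembly. A minimal sketch of that check, using SEQ ID NO: 20 and the first 22 bases of SEQ ID NO: 19 as printed; the helper name `reverse_complement` is illustrative and not part of the specification:

```python
# Illustrative helper (an assumption, not from the specification):
# verify that one mutagenic oligo is the reverse complement of the
# 5' end of its partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 20 and the first 22 bases of SEQ ID NO: 19 (Met182Thr pair).
seq20 = "TCGTGGTGTCACGCTCGTCGTT"
seq19_prefix = "AACGACGAGCGTGACACCACGA"

overlap_ok = reverse_complement(seq20) == seq19_prefix
print("oligos overlap:", overlap_ok)
```

The same comparison can be applied to the other pairs; a mismatch would point either to a transcription error in a sequence or to a deliberate mutated base inside the overlap.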

[0406] These separate PCR fragments were gel purified away from the synthetic oligonucleotides. 10 ng of each
fragment were combined and a reassembly reaction was performed (94°C for 1 minute and then 25 cycles of [94°C
for 30 seconds, 50°C for 30 seconds and 72°C for 45 seconds]). PCR was performed on the reassembly product for 25
cycles in the presence of the SfiI-containing outside primers (primers C and D from Example 5). The DNA was digested
with SfiI and inserted into the wild-type pUC182Sfi vector. The following mutant combinations were obtained (Table 4).

Table 4

Name   | Genotype                                                                      | MIC (µg/ml) | Source of MIC
TEM-1  | Wild-type                                                                     | 0.02        |
       | Glu104Lys                                                                     | 0.08        | 10
       | Gly238Ser                                                                     | 0.16        | 10
TEM-15 | Glu104Lys/Gly238Ser*                                                          | 10          |
TEM-3  | Glu104Lys/Gly238Ser/Gln39Lys                                                  | 10; 2-32    | 37, 15
ST-4   | Glu104Lys/Gly238Ser/Met182Thr*                                                | 10          |
ST-1   | Glu104Lys/Gly238Ser/Met182Thr/Ala18Val/T3959A/G3713A/G3934A/A3689G*           | 320         |
ST-2   | Glu104Lys/Gly238Ser/Met182Thr/Ala42Gly/Gly92Ser/Arg241His/T3842C/A3767G*      | 640         |
ST-3   | Glu104Lys/Gly238Ser/Met182Thr/Ala42Gly/Gly92Ser/Arg241His*                    | 640         |

* All of these mutants additionally contain the G4205A promoter mutation.


[0407] It was concluded that conserved mutations account for 9 of 15 doublings in the MIC.

[0408] Glu104Lys alone was shown to result only in a doubling of the MIC to 0.08 µg/ml, and Gly238Ser (in several
contexts with one additional amino acid change) resulted only in a MIC of 0.16 µg/ml (26). The double mutant
Glu104Lys/Gly238Ser has a MIC of 10 µg/ml. This mutant corresponds to TEM-15.

[0409] These same Glu104Lys and Gly238Ser mutations, in combination with Gln39Lys (TEM-3) or Thr263Met
(TEM-4), result in a high level of resistance (2-32 µg/ml for TEM-3 and 8-32 µg/ml for TEM-4) (34, 35).

[0410] A mutant containing the three amino acid changes that were conserved after the backcross (Glu104Lys/
Met182Thr/Gly238Ser) also had a MIC of 10 µg/ml. This meant that the mutations that each of the new selected mutants
had in addition to the three known mutations were responsible for a further 32 to 64-fold increase in the resistance of
the gene to cefotaxime.
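
The "9 of 15 doublings" and "32 to 64-fold" figures can be reproduced from the MIC values in Table 4. The following is a sketch of that arithmetic only; the function name `doublings` is illustrative, and the MIC values (in µg/ml) are the ones reported in the text:

```python
import math


def doublings(mic_start: float, mic_end: float) -> float:
    """Number of two-fold steps separating two MIC values."""
    return math.log2(mic_end / mic_start)


# MIC values in µg/ml, taken from Table 4.
WILD_TYPE = 0.02         # TEM-1
CONSERVED_TRIPLE = 10.0  # Glu104Lys/Met182Thr/Gly238Ser (ST-4)
ST1, ST2 = 320.0, 640.0  # selected mutants

conserved_steps = round(doublings(WILD_TYPE, CONSERVED_TRIPLE))
total_steps = round(doublings(WILD_TYPE, ST2))
extra_fold = (ST1 / CONSERVED_TRIPLE, ST2 / CONSERVED_TRIPLE)

print(conserved_steps, total_steps, extra_fold)
```

Under these values the conserved mutations contribute roughly 9 of about 15 two-fold steps, and the remaining mutations contribute a further 32-fold (ST-1) to 64-fold (ST-2) increase.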

[0411] The naturally occurring, clinical TEM-1-derived enzymes (TEM-1-19) each contain a different combination of
only 5-7 identical mutations (reviews). Since these mutations are in well separated locations in the gene, a mutant
with high cefotaxime resistance cannot be obtained by cassette mutagenesis of a single area. This may explain why
the maximum MIC that was obtained by the standard cassette mutagenesis approach is only 0.64 µg/ml (26). For
example, both the Glu104Lys as well as the Gly238Ser mutations were found separately in this study to have MICs
below 0.16 µg/ml. Use of DNA shuffling allowed combinatoriality and thus the Glu104Lys/Gly238Ser combination was
found, with a MIC of 10 µg/ml.

[0412] An important limitation of this example is the use of a single gene as a starting point. It is contemplated that 
better combinations can be found if a large number of related, naturally occurring genes are shuffled. The diversity 
that is present in such a mixture is more meaningful than the random mutations that are generated by mutagenic 
shuffling. For example, it is contemplated that one could use a repertoire of related genes from a single species, such
as the pre-existing diversity of the immune system, or related genes obtained from many different species. 

Example 7. Improvement of antibody A10B by DNA shuffling of a library of all six mutant CDRs.

[0413] The A10B scFv antibody, a mouse anti-rabbit IgG, was a gift from Pharmacia (Milwaukee WI). The
commercially available Pharmacia phage display system was used, which uses the pCANTAB5 phage display vector.

[0414] The original A10B antibody reproducibly had only a low avidity, since only clones that bound weakly to
immobilized antigen (rabbit IgG), as measured by phage ELISA (Pharmacia assay kit) or by phage titer, were obtained.
The concentration of rabbit IgG which yielded 50% inhibition of the A10B antibody binding in a competition assay was
13 picomolar. The observed low avidity may also be due to instability of the A10B clone.

[0415] The A10B scFv DNA was sequenced (United States Biochemical Co., Cleveland OH) according to the
manufacturer's instructions. The sequence was similar to existing antibodies, based on comparison to Kabat (33).

1) Preparation of phage DNA 

[0416] Phage DNA having the A10B wild-type antibody gene (10 µl) was incubated at 99°C for 10 min, then at 72°C
for 2 min. PCR mix (50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, 200 µM each dNTP, 1.9 mM MgCl), 0.6
µM of each primer and 0.5 µl Taq DNA Polymerase (Promega, Madison WI) was added to the phage DNA. A PCR
program was run for 35 cycles of [30 seconds at 94°C, 30 seconds at 45°C, 45 seconds at 72°C]. The primers used were:
5' ATGATTACGCCAAGCTTT 3' (SEQ ID NO:26) and
5' TTGTCGTCTTTCCAGACGTT 3' (SEQ ID NO:27).

[0417] The 850 bp PCR product was then electrophoresed and purified from a 2% low melting point agarose gel.

2) Fragmentation 

[0418] 300 ng of the gel purified 850 bp band was digested with 0.18 units of DNAse I (Sigma, St. Louis MO) in 50
mM Tris-HCl pH 7.5, 10 mM MgCl for 20 minutes at room temperature. The digested DNA was separated on a 2% low
melting point agarose gel and bands between 50 and 200 bp were purified from the gel.
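
The fragmentation and size-selection step can be pictured with a small simulation. This is only a sketch under stated assumptions: DNase I digestion is modelled as independent random cuts with a hypothetical mean fragment length, and the names `digest` and `size_select` are illustrative rather than taken from the specification. Only the 850 bp template length and the 50-200 bp window come from the text.

```python
import random


def digest(template, mean_fragment_len=100, rng=None):
    """Cut a sequence at random positions (simplified digestion model)."""
    rng = rng or random.Random()
    cut_probability = 1.0 / mean_fragment_len  # assumed, not from the text
    fragments, start = [], 0
    for position in range(1, len(template)):
        if rng.random() < cut_probability:
            fragments.append(template[start:position])
            start = position
    fragments.append(template[start:])
    return fragments


def size_select(fragments, low=50, high=200):
    """Keep fragments in the gel-purified size window (inclusive)."""
    return [f for f in fragments if low <= len(f) <= high]


rng = random.Random(0)
template = "".join(rng.choice("ACGT") for _ in range(850))  # stand-in gene
pool = size_select(digest(template, rng=rng))
print(len(pool), "fragments kept for reassembly")
```

In the actual protocol the fragment size distribution depends on enzyme units, buffer and time, so the cut probability here is a placeholder to be tuned, not a measured quantity.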
3) Construction of Test Library

[0419] The purpose of this experiment was to test whether the insertion of the CDRs would be efficient.

[0420] The following CDR sequences having internal restriction enzyme sites were synthesized. "CDR H" means a
CDR in the heavy chain and "CDR L" means a CDR in the light chain of the antibody.

CDR H1 (SEQ ID NO: 34)
5' TTCTtKXrrACATCTTCACAGAATTCATCTAGATTTGGGTGAGGCAGACGCCTGAA 3'

CDR H2 (SEQ ID NO: 35)
5' ACAGGGACTTGAGTGGATTGGAATCACAGTCAAGCTTATCCTTTATCTCAGGTCTCGAGTTCCAAGTACTTAAAGGGCCACACTGAGTGTA 3'

CDR H3 (SEQ ID NO: 36)
5' TGTCTATTTCTGTGCTAGATCTTGACTGCAGTCTTATACGAGGATCCATTGGGGCCAAGGGACCAGGTCA 3'

CDR L1 (SEQ ID NO: 37)
5' AGAGGGTCACCATGACCTGCGGACGTCTTTAAGCGATCGGGCTGATGGCCTGGTACCAACAGAAGCCTGGAT 3'

CDR L2 (SEQ ID NO: 38)
5' TCCCCCAGACTCCTGATTTATTAAGGGAGATCTAAACAGCTGTTGGTCCCTTTTCGCTTCAGT

CDR L3 (SEQ ID NO: 39)
5' ATGCTGCCACTTATTACTGCTTCTGCGCGCTTAAAGGATATCTTCATTTCGGAGGGGGGACCAAGCT 3'


[0421] The CDR oligos were added to the purified A10B antibody DNA fragments of between 50 to 200 bp from step
(2) above at a 10 fold molar excess. The PCR mix (50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, 1.9 mM
MgCl, 200 µM each dNTP, 0.3 µl Taq DNA polymerase (Promega, Madison WI), 50 µl total volume) was added and
the shuffling program run for 1 min at 94°C, 1 min at 72°C, and then 35 cycles: 30 seconds at 94°C, 30 seconds at
[illegible].

[0422] [illegible] of the shuffled mixture was added to 100 µl of a PCR mix (50 mM KCl, 10 mM Tris-HCl pH 9.0,
0.1% Triton X-100, 200 µM each dNTP, 1.9 mM MgCl, 0.6 µM each of the two outside primers (SEQ ID NO:26 and 27,
see below), 0.5 µl Taq DNA polymerase) and the PCR program was run for 30 cycles of [30 seconds at 94°C, 30 seconds
at 45°C, 45 seconds at 72°C]. The resulting mixture of DNA fragments of 850 basepair size was phenol/chloroform
extracted and ethanol precipitated.

[0423] The outside primers were:

Outside Primer 1: SEQ ID NO:27 5' TTGTCGTCTTTCCAGACGTT 3'
Outside Primer 2: SEQ ID NO:26 5' ATGATTACGCCAAGCTTT 3'

[0424] The 850 bp PCR product was digested with the restriction enzymes SfiI and NotI, purified from a low melting
point agarose gel, and ligated into the pCANTAB5 expression vector obtained from Pharmacia, Milwaukee WI. The
ligated vector was electroporated according to the method set forth by Invitrogen (San Diego CA) into TG1 cells
(Pharmacia, Milwaukee WI) and plated for single colonies.

[0425] The DNA from the resulting colonies was added to 100 µl of a PCR mix (50 mM KCl, 10 mM Tris-HCl pH 9.0,
0.1% Triton X-100, 200 µM each dNTP, 1.9 mM MgCl, 0.6 µM of Outside primer 1 (SEQ ID NO:27, see below), six
inside primers (SEQ ID NOS:40-45; see below), and 0.5 µl Taq DNA polymerase) and a PCR program was run for 35
cycles of [30 seconds at 94°C, 30 seconds at 45°C, 45 seconds at 72°C]. The sizes of the PCR products were
determined by agarose gel electrophoresis, and were used to determine which CDRs with restriction sites were inserted.

CDR Inside Primers:

H 1 (SEQ ID NO:40) 5' AGAATTCATCTAGATTTG 3',
H 2 (SEQ ID NO:41) 5' GCTTATCCTTTATCTCAGGTC 3',
H 3 (SEQ ID NO:42) 5' ACTGCAGTCTTATACGAGGAT 3',
L 1 (SEQ ID NO:43) 5' GACGTCTTTAAGCGATCG 3',
L 2 (SEQ ID NO:44) 5' TAAGGGAGATCTAAACAG 3',
L 3 (SEQ ID NO:45) 5' TCTGCGCGGTTAAAGGAT 3'

[0426] The six synthetic CDRs were inserted at the expected locations in the wild-type A10B antibody DNA (Figure
7). These studies showed that, while each of the six CDRs in a specific clone has a small chance of being a CDR with
a restriction site, most of the clones carried at least one CDR with a restriction site, and that any possible combination
of CDRs with restriction sites was generated.

4) Construction of Mutant Complementarity Determining Regions ("CDRs") 

[0427] Based on our sequence data, six oligonucleotides corresponding to the six CDRs were made. The CDRs
(Kabat definition) were synthetically mutagenized at a ratio of 70 (existing base):10:10:10, and were flanked on the 5'
and 3' sides by about 20 bases of flanking sequence, which provide the homology for the incorporation of the CDRs
when mixed into a mixture of unmutagenized antibody gene fragments in a molar excess. The resulting mutant
sequences are given below.
Oligos for CDR Library

CDR H1 (SEQ ID NO: 28)
5' TTCT[illegible]GATATAGACTGGGTGAGGCAGACGCCTGAA 3'

CDR H2 (SEQ ID NO: 29)
[illegible]TTGAAGGGCAGGGCCACACTGAGTGTA 3'

CDR H3 (SEQ ID NO: 30)
[illegible]GACCACGGTCA 3'

CDR L1 (SEQ ID NO: 31)
5' [illegible]TACATATATTGGTACCAACAGAAGCCTGGAT 3'

CDR L2 (SEQ ID NO: 32)
[illegible]

CDR L3 (SEQ ID NO: 33)
[illegible]ACCAAGCT 3'

Bold and underlined sequences were the mutant sequences synthesized using a mixture of nucleosides of 70:10:10:10
where 70% was the wild-type nucleoside.
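
The 70:10:10:10 doping scheme can be sketched as a per-position draw in which the wild-type base is kept with probability 0.7 and each of the three other bases is chosen with probability 0.1. The function name `dope` and the example sequence below are hypothetical and serve only to illustrate the ratio; they are not the actual CDR sequences.

```python
import random

BASES = "ACGT"


def dope(seq, wild_type_fraction=0.7, rng=None):
    """Return a variant of seq with each position mutagenized 70:10:10:10."""
    rng = rng or random.Random()
    out = []
    for base in seq:
        if rng.random() < wild_type_fraction:
            out.append(base)  # keep the existing base
        else:
            # the remaining probability is split evenly over the other bases
            out.append(rng.choice([b for b in BASES if b != base]))
    return "".join(out)


example_cdr = "GGCTACACATTCACTGAC"  # hypothetical 18-base CDR stand-in
variant = dope(example_cdr, rng=random.Random(42))
differences = sum(a != b for a, b in zip(example_cdr, variant))
print(variant, differences)
```

On average about 30% of the doped positions differ from wild type in any one oligonucleotide, while the unmutated flanking sequence (not modelled here) preserves the homology needed for incorporation.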

[0428] A 10 fold molar excess of the CDR mutant oligos were added to the purified A10B antibody DNA fragments
of between 50 to 200 bp in length from step (2) above. The PCR mix (50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton
X-100, 1.9 mM MgCl, 200 µM each dNTP, 0.3 µl Taq DNA polymerase (Promega, Madison WI), 50 µl total volume) was
added and the shuffling program run for 1 min at 94°C, 1 min at 72°C, and then 35 cycles: (30 seconds at 94°C, 30
[illegible]).

[0429] [illegible] of the shuffled mixture was added to 100 µl of a PCR mix (50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1%
Triton X-100, 200 µM each dNTP, 1.9 mM MgCl, 0.6 µM each of the two outside primers (SEQ ID NO:26 and 27, see
below), 0.5 µl Taq DNA polymerase) and the PCR program was run for 30 cycles of [30 seconds at 94°C, 30 seconds
at 45°C, 45 seconds at 72°C]. The resulting mixture of DNA fragments of 850 basepair size was phenol/chloroform
extracted and ethanol precipitated.

[0430] The outside primers were:

Outside Primer 1: SEQ ID NO:27 5' TTGTCGTCTTTCCAGACGTT 3'
Outside Primer 2: SEQ ID NO:26 5' ATGATTACGCCAAGCTTT 3'

5) Cloning of the scFv antibody DNA into pCANTAB5 

[0431] The 850 bp PCR product was digested with the restriction enzymes SfiI and NotI, purified from a low melting
point agarose gel, and ligated into the pCANTAB5 expression vector obtained from Pharmacia, Milwaukee WI. The
ligated vector was electroporated according to the method set forth by Invitrogen (San Diego CA) into TG1 cells
(Pharmacia, Milwaukee WI) and the phage library was grown up using helper phage following the guidelines
recommended by the manufacturer.

[0432] The library that was generated in this fashion was screened for the presence of improved antibodies, using
six cycles of selection.

6) Selection of high affinity clones 

[0433] 15 wells of a 96 well microtiter plate were coated with Rabbit IgG (Jackson Immunoresearch, Bar Harbor ME)
at 10 µg/well for 1 hour at 37°C, and then blocked with 2% non-fat dry milk in PBS for 1 hour at 37°C.

[0434] 100 µl of the phage library (1x10^10 cfu) was blocked with 100 µl of 2% milk for 30 minutes at room temperature
and then added to each of the 15 wells and incubated for 1 hour at 37°C.

[0435] Then the wells were washed three times with PBS containing 0.5% Tween-20 at 37°C for 10 minutes per
wash. Bound phage was eluted with 100 µl elution buffer (Glycine-HCl, pH 2.2), followed by immediate neutralization
with 2M Tris pH 7.4 and transfection for phage production. This selection cycle was repeated six times.

[0436] After the sixth cycle, individual phage clones were picked and the relative affinities were compared by phage
ELISA, and the specificity for the rabbit IgG was assayed with a kit from Pharmacia (Milwaukee WI) according to the
methods recommended by the manufacturer.

[0437] The best clone has an approximately 100-fold improved expression level compared with the wild-type A10B
when tested by the Western assay. The concentration of the rabbit IgG which yielded 50% inhibition in a competition
assay with the best clone was 1 picomolar. The best clone was reproducibly specific for rabbit antigen. The number
of copies of the antibody displayed by the phage appears to be increased.

Example 8. In vivo recombination via direct repeats of partial genes

[0438] A plasmid was constructed with two partial, inactive copies of the same gene (beta-lactamase) to demonstrate
that recombination between the common areas of these two direct repeats leads to full-length, active recombinant
genes.

[0439] A pUC18 derivative carrying the bacterial TEM-1 betalactamase gene was used (Yanisch-Perron et al. 1985
Gene 33:103-119). The TEM-1 betalactamase gene ("Bla") confers resistance to bacteria against approximately 0.02
µg/ml of cefotaxime. SfiI restriction sites were added 5' of the promoter and 3' of the end of the betalactamase gene
by PCR of the vector sequence with two primers:

Primer A (SEQ ID NO: 46) 5' TTCTATTGACGGCCTGTCAGGCCTCATATATACTTTAGATTGATTT 3'
Primer B (SEQ ID NO: 47) 5' TTGACGCACTGGCCATGGTGGCCAAAAATAAACAAATAGGGGTTCCGCGCACATTT 3'

and by PCR of the beta-lactamase gene sequence with two other primers:

Primer C (SEQ ID NO: 48) 5' AACTGACCACGGCCTGACAGGCCGGTCTGACAGTTACCAATGCTT 3'
Primer D (SEQ ID NO: 49) 5' AACCTGTCCTGGCCACCATGGCCTAAATACATTCAAATATGTAT 3'

[0440] The two reaction products were digested with SfiI, mixed, ligated and used to transform competent E. coli
bacteria by the procedure described below. The resulting plasmid was pUC182Sfi-Bla-Sfi. This plasmid contains an
SfiI fragment carrying the Bla gene and the P-3 promoter.

[0441] The minimum inhibitory concentration of cefotaxime for E. coli XL1-blue (Stratagene, San Diego CA) carrying
pUC182Sfi-Bla-Sfi was 0.02 µg/ml after 24 hours at 37°C.

[0442] The tetracycline gene of pBR322 was cloned into pUC18Sfi-Bla-Sfi using the homologous areas, resulting in
pBR322TetSfi-Bla-Sfi. The TEM-1 gene was then deleted by restriction digestion of the pBR322TetSfi-Bla-Sfi with SspI
and FspI and blunt-end ligation, resulting in pBR322TetSfi-Sfi.

[0443] Overlapping regions of the TEM-1 gene were amplified using standard PCR techniques and the following
primers:

Primer 2650 (SEQ ID NO: 50) 5' TTCTTAGACGTCAGGTGGCACTT 3'
Primer 2493 (SEQ ID NO: 51) 5' TTT TAA ATC AAT CTA AAG TAT 3'
Primer 2651 (SEQ ID NO: 52) 5' TGCTCATCCACGAGTGTGGAGAAGTGGTCCTGCAACTTTAT 3', and
Primer 2652 (SEQ ID NO: 53) ACCACTTCTCCACACTCGTGGATGAGCACTTTTAAAGTT

[0444] The two resulting DNA fragments were digested with SfiI and BstXI and ligated into the Sfi site of
pBR322TetSfi-Sfi. The resulting plasmid was called pBR322Sfi-BL-LA-Sfi. A map of the plasmid as well as a schematic
of intraplasmidic recombination and reconstitution of functional beta-lactamase is shown in Figure 9.
[0445] The plasmid was electroporated into either TG-1 or JC8679 E. coli cells. E. coli JC8679 is RecBC sbcA (Oliner
et al. 1993 NAR 21:5192). The cells were plated on solid agar plates containing tetracycline. Those colonies which
grew were then plated on solid agar plates containing 100 µg/ml ampicillin and the number of viable colonies counted.
The beta-lactamase gene inserts in those transformants which exhibited ampicillin resistance were amplified by
standard PCR techniques using Primer 2650 (SEQ ID NO: 50) 5' TTCTTAGACGTCAGGTGGCACTT 3' and Primer 2493
(SEQ ID NO: 51) 5' TTTTAAATCAATCTAAAGTAT 3' and the length of the insert measured. The presence of a 1 kb
insert indicates that the gene was successfully recombined, as shown in Fig. 9 and Table 5.

TABLE 5

Cell           | Tet Colonies | Amp Colonies | Colony PCR
TG-1           | 131          | 21           | 3/3 at 1 kb
JC8679         | 123          | 31           | 4/4 at 1 kb
vector control | 51           | 0            |


[0446] About 17-25% of the tetracycline-resistant colonies were also ampicillin-resistant and all of the ampicillin
resistant colonies had correctly recombined, as determined by colony PCR. Therefore, partial genes located on the
same plasmid will successfully recombine to create a functional gene.
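
The recombination frequency quoted above can be recomputed from the colony counts in Table 5 as the fraction of tetracycline-resistant colonies that were also ampicillin-resistant. This is a sketch of the arithmetic only; the dictionary layout and variable names are illustrative.

```python
# Colony counts from Table 5: (tetracycline-resistant, ampicillin-resistant).
table5 = {
    "TG-1": (131, 21),
    "JC8679": (123, 31),
}

# Fraction of Tet-resistant colonies that also grew on ampicillin.
frequencies = {strain: amp / tet for strain, (tet, amp) in table5.items()}

for strain, fraction in frequencies.items():
    print(f"{strain}: {fraction:.1%} of Tet-resistant colonies were Amp-resistant")
```

With the counts as printed, the two strains come out at roughly 16% and 25%, which is close to the "about 17-25%" stated in the text.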

Example 9. In vivo recombination via direct repeats of full-length genes.

[0447] A plasmid with two full-length copies of different alleles of the beta-lactamase gene was constructed.
Homologous recombination of the two genes resulted in a single recombinant full-length copy of that gene.

[0448] The construction of pBR322TetSfi-Sfi and pBR322TetSfi-Bla-Sfi was described above.

[0449] The two alleles of the beta-lactamase gene were constructed as follows. Two PCR reactions were conducted
with pUC18Sfi-Bla-Sfi as the template. One reaction was conducted with the following primers:

Primer 2650 (SEQ ID NO: 50) 5' TTCTTAGACGTCAGGTGGCACTT 3'
Primer 2649 (SEQ ID NO: 51) 5' ATGGTAGTGCACGAGTGTGGTAGTGAGAGGCCGGTCTGACAGTTACCAATGCTT 3'

The second PCR reaction was conducted with the following primers:

Primer 2648 (SEQ ID NO: 54) 5' TGTCACTAGCACACTCGTGGACTAGCATGGCCTAAATACATTGAAATATGTAT 3'
Primer 2493 (SEQ ID NO: 51) 5' TTT TAA ATC AAT CTA AAG TAT 3'

[0450] This yielded two Bla genes, one with a 5' SfiI site and a 3' BstXI site, the other with a 5' BstXI site and a 3'
SfiI site.

[0451] After digestion of these two genes with BstXI and SfiI, and ligation into the SfiI-digested plasmid
pBR322TetSfi-Sfi, a plasmid (pBR322-Sfi-2BLA-Sfi) with a tandem repeat of the Bla gene was obtained. (See Figure
[number illegible].)

[0452] The plasmid was electroporated into E. coli cells. The cells were plated on solid agar plates containing 15 µg/
ml tetracycline. Those colonies which grew were then plated on solid agar plates containing 100 µg/ml ampicillin and
the number of viable colonies counted. The Bla inserts in those transformants which exhibited ampicillin resistance
were amplified by standard PCR techniques using the method and primers described in Example 8. The presence of
a 1 kb insert indicated that the duplicate genes had recombined, as indicated in Table 6.

TABLE 6

Cell           | Tet Colonies | Amp Colonies | Colony PCR
TG-1           | 28           | 54           | 7/7 at 1 kb
JC8679         | 149          | 117          | 3/3 at 1 kb
vector control | 51           | 0            |


Colony PCR confirmed that the tandem repeat was efficiently recombined to form a single recombinant gene. 

Example 10. Multiple cycles of direct repeat recombination - Interplasmidic

[0453] In order to determine whether multiple cycles of recombination could be used to produce resistant cells more
quickly, multiple cycles of the method described in Example 9 were performed.

[0454] The minus recombination control consisted of a single copy of the betalactamase gene, whereas the plus
recombination experiment consisted of inserting two copies of betalactamase as a direct repeat. The tetracycline
marker was used to equalize the number of colonies that were selected for cefotaxime resistance in each round, to
compensate for ligation efficiencies.

[0455] In the first round, pBR322TetSfi-Bla-Sfi was digested with EcrI and subject to PCR with a 1:1 mix (1 ml) of
normal and Cadwell PCR mix (Cadwell and Joyce (1992) PCR Methods and Applications 2: 28-33) for error prone
PCR. The PCR program was 70°C for 2 minutes initially and then 30 cycles of 94°C for 30 seconds, 52°C for 30 seconds
and 72°C for 3 minutes and 6 seconds per cycle, followed by 72°C for 10 minutes.

[0456] The primers used in the PCR reaction to create the one Bla gene control plasmid were Primer 2650 (SEQ ID
NO: 50) and Primer 2719 (SEQ ID NO: 55) 5' TTAAGGGATTTTGGTCATGAGATT 3'. This resulted in a mixed population
of amplified DNA fragments, designated collectively as Fragment #59. These fragments had a number of different
mutations.

[0457] The primers used in two different PCR reactions to create the two Bla gene plasmids were Primer 2650 (SEQ
ID NO: 50) and Primer 2649 (SEQ ID NO: 51) for the first gene and Primer 2648 (SEQ ID NO: 54) and Primer 2719
(SEQ ID NO: 55) for the second gene. This resulted in a mixed population of each of the two amplified DNA fragments:
Fragment #89 (amplified with primers 2648 and 2719) and Fragment #90 (amplified with primers 2650 and 2649). In
each case a number of different mutations had been introduced into the mixed population of each of the fragments.

[0458] After error prone PCR, the population of amplified DNA fragment #59 was digested with SfiI, and then cloned
into pBR322TetSfi-Sfi to create a mixed population of the plasmid pBR322Sfi-Bla-Sfi1.

[0459] After error prone PCR, the population of amplified DNA fragments #90 and #89 was digested with SfiI and
BstXI at 50°C, and ligated into pBR322TetSfi-Sfi to create a mixed population of the plasmid pBR322TetSfi-2Bla-Sfi1.

[0460] The plasmids pBR322Sfi-Bla-Sfi1 and pBR322Sfi-2Bla-Sfi1 were electroporated into E. coli JC8679 and
placed on agar plates having differing concentrations of cefotaxime to select for resistant strains and on tetracycline
plates to titre.

[0461] An equal number of colonies (based on the number of colonies growing on tetracycline) were picked, grown
in LB-tet and DNA extracted from the colonies. This was one round of the recombination. This DNA was digested with
EcrI and used for a second round of error-prone PCR as described above.

[0462] After five rounds the MIC (minimum inhibitory concentration) for cefotaxime for the one fragment plasmid was
0.32 whereas the MIC for the two fragment plasmid was 1.28. The results show that after five cycles the resistance
obtained was four-fold higher in the presence of in vivo recombination.
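
The four-fold figure follows directly from the two MIC values. As a sketch of the comparison, the snippet below also expresses the difference as two-fold steps and relates both values to the wild-type MIC of 0.02 µg/ml given earlier for pUC182Sfi-Bla-Sfi; treating the round-five values as being in the same units is an assumption, since the paragraph does not state them.

```python
import math

# MIC values after five rounds, as reported; units assumed to be µg/ml.
MIC_ONE_COPY = 0.32    # single Bla gene (no direct repeat)
MIC_TWO_COPIES = 1.28  # two Bla genes as a direct repeat
MIC_WILD_TYPE = 0.02   # starting TEM-1 value reported earlier

fold_gain = MIC_TWO_COPIES / MIC_ONE_COPY
extra_doublings = math.log2(fold_gain)
fold_over_wild_type = (MIC_ONE_COPY / MIC_WILD_TYPE,
                       MIC_TWO_COPIES / MIC_WILD_TYPE)

print(fold_gain, extra_doublings, fold_over_wild_type)
```

Read this way, recombination adds about two extra doublings over error-prone PCR alone across the five rounds.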

Example 11. In vivo recombination via electroporation of fragments

[0463] Competent E. coli cells containing pUC18Sfi-Bla-Sfi were prepared as described. Plasmid pUC18Sfi-Bla-Sfi
contains the standard TEM-1 beta-lactamase gene as described, supra.

[0464] A TEM-1 derived cefotaxime resistance gene from pUC18Sfi-cef-Sfi (clone ST2) (Stemmer WPC (1994)
Nature 370: 389-91, incorporated herein by reference), which confers on E. coli carrying the plasmid an MIC of 640 µg/
ml for cefotaxime, was obtained. In one experiment the complete plasmid pUC18Sfi-cef-Sfi DNA was electroporated
into E. coli cells having the plasmid pUC18Sfi-Bla-Sfi.

[0465] In another experiment the DNA fragment containing the cefotaxime gene from pUC18Sfi-cef-Sfi was amplified
by PCR using the primers 2650 (SEQ ID NO: 50) and 2719 (SEQ ID NO: 55). The resulting 1 kb PCR product was
digested into DNA fragments of <100 bp by DNase and these fragments were electroporated into the competent E.
coli cells which already contained pUC18Sfi-Bla-Sfi.

[0466] The transformed cells from both experiments were then assayed for their resistance to cefotaxime by plating
the transformed cells onto agar plates having varying concentrations of cefotaxime. The results are indicated in Table 7.



TABLE 7

Colonies / Cefotaxime Concentration

                       | 0.16 | 0.32 | 1.28 | 5.0 | 10.0
no DNA control         | 14   |      |      |     |
ST-2 mutant, whole     |      | 4000 | 2000 | 800 | 400
ST-2 mutant, fragments |      | 1000 | 120  | 22  | 7
Wildtype, whole        | 27   |      |      |     |
Wildtype, fragments    | 18   |      |      |     |


[0467] From the results it appears that the whole ST-2 Cef gene was inserted into either the bacterial genome or the
plasmid after electroporation. Because most insertions are homologous, it is expected that the gene was inserted into
the plasmid, replacing the wildtype gene. The fragments of the Cef gene from ST-2 also inserted efficiently into the
wild-type gene in the plasmid. No sharp increase in cefotaxime resistance was observed with the introduction of the
wildtype gene (whole or in fragments) or with no DNA. Therefore, the ST-2 fragments were shown to yield much greater
cefotaxime resistance than the wild-type fragments. It was contemplated that repeated insertions of fragments,
prepared from increasingly resistant gene pools, would lead to increasing resistance.

[0468] Accordingly, those colonies that produced increased cefotaxime resistance with the ST-2 gene fragments were
isolated and the plasmid DNA extracted. This DNA was amplified using PCR by the method described above. The
amplified DNA was digested with DNase into fragments (<100 bp) and 2-4 µg of the fragments were electroporated
into competent E. coli cells already containing pUC18Sfi-Bla-Sfi as described above. The transformed cells were
plated on agar containing varying concentrations of cefotaxime.

[0469] As a control, competent E. coli cells having the plasmid pUC18Sfi-Kan-Sfi were also used. DNA fragments
from the digestion of the PCR product of pUC18Sfi-cef-Sfi were electroporated into these cells. There is no homology
between the kanamycin gene and the beta-lactamase gene and thus recombination should not occur.

[0470] This experiment was repeated for 2 rounds and the results are shown in Table 8.

TABLE 8

Round   | Cef conc. | KAN control  | Cef resistant colonies
1       | 0.16-0.64 | lawn         | lawn
replate | 0.32      | 10 small     | 1000
2       | 10        | 10           | 400
Replate |           | 100 sm @ 2.5 | 50 @ 10
3       | 40        | 100 sm       |
        | 1280      |              | 100 sm




Example 12. Determination of Recombination Formats

[0471] This experiment was designed to determine which format of recombination generated the most recombinants
per cycle.

[0472] In the first approach, the vector pUC18Sfi-Bla-Sfi was amplified with PCR primers to generate a large and
small fragment. The large fragment had the plasmid and ends having portions of the Bla gene, and the small fragment
coded for the middle of the Bla gene. A third fragment having the complete Bla gene was created using PCR by the
method in Example 8. The larger plasmid fragment and the fragment containing the complete Bla gene were
electroporated into E. coli JC8679 cells at the same time by the method described above and the transformants plated on
differing concentrations of cefotaxime.

[0473] In approach 2 the vector pUC18Sfi-Bla-Sfi was amplified to produce the large plasmid fragment isolated as
in approach 1 above. The two fragments each comprising a portion of the complete Bla gene, such that the two
fragments together spanned the complete Bla gene, were also obtained by PCR. The large plasmid fragment and the two
Bla gene fragments were all electroporated into competent E. coli JC8679 cells and the transformants plated on varying
concentrations of cefotaxime.

[0474] In the third approach, both the vector and the plasmid were electroporated into E. coli JC8679 cells and the
transformants were plated on varying concentrations of cefotaxime.

[0475] In the fourth approach, the complete Bla gene was electroporated into E. coli JC8679 cells already containing
the vector pUCSfi-Sfi and the transformants were plated on varying concentrations of cefotaxime. As controls, the E.
coli JC8679 cells were electroporated with either the complete Bla gene or the vector alone.

[0476] The results are presented in Figure 11. The efficiency of the insertion of two fragments into the vector is
100-fold lower than when one fragment having the complete Bla gene is used. Approach 3 indicated that the efficiency of
insertion does depend on the presence of free DNA ends, since no recombinants were obtained with this approach.
However, the results of approach 3 were also due to the low efficiency of electroporation of the vector. When the
expression vector is already in the competent cells, the efficiency of the vector electroporation is no longer a factor
and efficient homologous recombination can be achieved even with uncut vector.

Example 12. Kit for cassette shuffling to optimize vector performance 

[0477] In order to provide a vector capable of conferring an optimized phenotype (e.g., maximal expression of a
vector-encoded sequence, such as a cloned gene), a kit is provided comprising a variety of cassettes which can be
shuffled, and optimized shufflants can be selected. Figure 12 shows schematically one embodiment, with each locus
having a plurality of cassettes. For example, in a bacterial expression system, Figure 13 shows exemplary cassettes
that are used at the respective loci. Each cassette of a given locus (e.g., all promoters in this example) are flanked by
substantially identical sequences capable of overlapping the flanking sequence(s) of cassettes of an adjacent locus
and preferably also capable of participating in homologous recombination or non-homologous recombination (e.g., lox/
cre or flp/frt systems), so as to afford shuffling of cassettes within a locus but substantially not between loci.

[0478] Cassettes are supplied in the kit as PCR fragments, with each cassette type or individual cassette species
packaged in a separate tube. Vector libraries are created by combining the contents of tubes to assemble whole
plasmids or substantial portions thereof by hybridization of the overlapping flanking sequences of cassettes at each locus
with cassettes at the adjacent loci. The assembled vector is ligated to a predetermined gene of interest to form a vector
library wherein each library member comprises the predetermined gene of interest and [illegible]
determined by the association of cassettes. The vectors are transferred into a suitable host cell and the cells are
cultured under conditions suitable for expression, and the desired phenotype is selected.

Example 13. Shuffling to optimize Green Fluorescent Protein (GFP) properties
Background

[0479] Green fluorescent protein ("GFP") is a polypeptide derived from an apopeptide having 238 amino acid residues
and a molecular weight of approximately 27,000. GFP contains a chromophore formed from amino acid residues 65
through 67. As its name indicates, GFP fluoresces; it does not bioluminesce like luciferase. In vivo, the chromophore
of GFP is activated by energy transfer from coelenterazine complexed with the photoprotein aequorin, with GFP
exhibiting green fluorescence at 510 nm. Upon irradiation with blue or UV light, GFP exhibits green fluorescence at
approximately 510 nm.

[0480] The green fluorescent protein (GFP) of the jellyfish Aequorea victoria is a very useful reporter for gene
expression and regulation (Prasher et al. (1992) Gene 111: 229; Prasher et al. (1995) Trends in Genetics 11: 320; Chalfie
et al. (1994) Science 263: 802, incorporated herein by reference). WO95/21191 discloses a polynucleotide sequence
encoding a 238 amino acid GFP apoprotein which contains a chromophore formed from amino acids 65 through 67.
WO95/21191 discloses that a modification of the cDNA for the apopeptide of A. victoria GFP results in synthesis of a
peptide having altered fluorescent properties. A mutant GFP (S65T) resulting in a 4-6-fold improvement in excitation
amplitude has been reported (Heim et al. (1994) Proc. Natl. Acad. Sci. (U.S.A.) 91: 12501).

Overview

[0481] Green fluorescent protein (GFP) has rapidly become a widely used reporter of gene regulation. However, in
many organisms, particularly eukaryotes, the whole cell fluorescence signal was found to be too low. The goal was to
improve the whole cell fluorescence of GFP for use as a reporter for gene regulation for E. coli and mammalian cells.
The improvement of GFP by rational design appeared difficult because the quantum yield of GFP is already high
(Ward et al. (1982) Photochem. Photobiol. 35: 803) and the expression level of GFP in a standard E. coli construct
was already about 75% of total protein.

[0482] Improvement of GFP was performed first by synthesis of a GFP gene with improved codon usage. The GFP
gene was then further improved by the disclosed method(s), consisting of recursive cycles of DNA shuffling or sexual
PCR of the GFP gene, combined with visual selection of the brightest clones. The whole cell fluorescence signal in E.
coli was optimized and selected mutants were then assayed to determine performance of the best GFP mutants in
eukaryotic cells.

[0483] A synthetic gene was synthesized having improved codon usage and having a 2.8-fold improvement of the
E. coli whole cell fluorescence signal compared to the industry standard GFP construct (Clontech, Palo Alto, CA). An
additional 16-fold improvement was obtained from three cycles of sexual PCR and visual screening for the brightest
E. coli colonies, for a 45-fold improvement over the standard construct. Expressed in Chinese Hamster Ovary (CHO)
cells, this shuffled mutant showed a 42-fold improvement of signal over the synthetic construct. The expression level
in E. coli was unaltered at about 75% of total protein. The emission and excitation maxima of the GFP were also
unchanged. Whereas in E. coli most of the wildtype GFP ends up in inclusion bodies, unable to activate its chromophore,
most of the mutant protein(s) were soluble and active. The three amino acid mutations thus guide the mutant protein
into the native folding pathway rather than toward aggregation. The results show that DNA sequence shuffling (sexual
PCR) can solve complex practical problems and generate advantageous mutant variants rapidly and efficiently.

MATERIALS AND METHODS 
GFP gene construction 

[0484] A gene encoding the GFP protein with the published sequence (Prasher et al. (1995) op. cit., incorporated
herein by reference) (238 AA, 27 kD) was constructed from oligonucleotides. In contrast to the commercially available
GFP construct (Clontech, Palo Alto, CA), the sequence included the Ala residue after the fMet, as found in the original
cDNA clone. Fourteen oligonucleotides ranging from 54 to 85 bases were assembled as seven pairs by PCR extension.
These segments were digested with restriction enzymes and cloned separately into the vector Alpha+GFP (Whitehorn
et al. (1995) Biotechnology 13: 1215, incorporated herein by reference) and sequenced. These segments were then
ligated into the eukaryotic expression vector Alpha+ to form the full-length GFP construct, Alpha+GFP (Fig. 14). The
resulting GFP gene contained altered Arginine codons at amino acid positions 73 (CGT), 80 (CGG), 96 (CGC) and
122 (CGT). To reduce codon bias and facilitate expression in E. coli, a number of other silent mutations were engineered
into the sequence to create the restriction sites used in the assembly of the gene. These were S2 (AGT to AGC; to
create an NheI site), K41 (AAA to AAG; HindIII), Y74 (TAC to TAT) and P75 (CCA to CCG; BspE1), [...;
NnuI), L141 (CTC to TTG) and E142 (GAA to GAG; XhoI), S175 (TCG to AGC; BamHI) and S202 (TCG to TGC; SalI).
The 5' and 3' untranslated ends of the gene contained XbaI and EcoRI sites, respectively. The sequence of the gene
was confirmed by sequencing.

[0485] Other suitable GFP vectors and sequences can be obtained from the GenBank database, such as via Internet
World Wide Web, as files: GVU36202, CVU36201, XXP35SGFP, XXU19282, XXU19279, XXU19277, XXU19276,
AVGFP2, AVGFP1, XXU19281, XXU19280, XXU19278, AEVGFP, and XXU17997, which are incorporated herein by
reference to the same extent as if the sequence files and comments were printed and inserted herein.
[0486] The XbaI-EcoRI fragment of Alpha+GFP, containing the whole GFP gene, was subcloned into the prokaryotic
expression vector pBAD18 (Guzman et al. (1995) J. Bacteriol. 177: 4121), resulting in the bacterial expression vector
pBAD18-GFP (Fig. 14). In this vector GFP gene expression is under the control of the arabinose promoter/repressor
(araBAD), which is inducible with arabinose (0.2%). Because this is the only construct with the original amino acid
sequence, it is referred to as wildtype GFP ('wt'). A GFP-expressing bacterial vector was obtained from Clontech (Palo
Alto, CA), which is referred to herein as the 'Clontech' construct. GFP expression from the 'Clontech' construct requires
IPTG induction.

Gene shuffling and selection

[0487] An approximately 1 kb DNA fragment containing the whole GFP gene was amplified from the pBAD18-GFP
vector by PCR with primers 5'-TAGCGGATCCTACCTGACGC (near NheI site) and 5'-GAAAATCTTCTCTCATCCG
(near EcoRI site) and purified by Wizard PCR prep (Promega, Madison, WI). This PCR product was digested into
random fragments with DNase I (Sigma) and 50-300 bp fragments were purified from 2% low melting point agarose
gels. The purified fragments were resuspended at 10-30 ng/µl in PCR mixture (Promega, Madison, WI; 0.2 mM each
dNTP/2.2 mM MgCl2/50 mM KCl/10 mM Tris-HCl, pH 9.0/0.1% Triton X-100) with Taq DNA polymerase (Promega)
and assembled (without primers) using a PCR program of 35 cycles of 94°C 30 s, 45°C 30 s, 72°C 30 s, as described
in Stemmer, WPC (1994) Nature 370: 389, incorporated herein by reference. The product of this reaction was diluted
40x into new PCR mix, and the full length product was amplified with the same two primers in a PCR of 25 cycles of
94°C 30 s, 50°C 30 s, 72°C 30 s, followed by 72°C for 10 min. After digestion of the reassembled product with NheI and
EcoRI, this library of point-mutated and in vitro recombined GFP genes was cloned back into the pBAD vector,
electroporated into E. coli TG1 (Pharmacia), and plated on LB plates with 100 µg/ml ampicillin and 0.2% arabinose to
induce GFP expression from the arabinose promoter.

Mutant selection 

[0488] Over a standard UV light box (365 nm) the 40 brightest colonies were selected and pooled. The pool of
colonies was used as the template for a PCR reaction to obtain a pool of GFP genes. Cycles 2 and 3 were performed
identically to cycle 1. The best mutant from cycle 3 was identified by growing colonies in microtiter plates and fluorescence
spectrometry of the microtiter plates.

[0489] For characterization of mutants in E. coli, DNA sequencing was performed on an Applied Biosystems 391
DNA sequencer.

CHO cell expression of GFP

[0490] The wildtype and the cycle 2 and 3 mutant versions of the GFP gene were transferred into the eukaryotic
expression vector Alpha+ (16) as an EcoRI-XbaI fragment. The plasmids were transfected into CHO cells by
electroporation of 10^ cells in 0.8 ml with 40 µg of plasmid at 400 V and 260 µF. Transformants were selected using 1 mg/ml
G418 for 10-12 days.

[0491] FACS analysis was carried out on a Becton Dickinson FACSTAR Plus using an Argon ion laser tuned to 488
nm. Fluorescence was observed with a 535/30 nm bandpass filter.

RESULTS 

Codon usage 

[0492] E. coli expressing the synthetic GFP construct ('wt') with altered codon usage yielded a nearly 3-fold greater
whole cell fluorescence signal than cells expressing the 'Clontech' construct (Fig. 15A). The comparison was performed
at full induction and at equal OD600. In addition to the substitution of poor arginine codons in the 'wt' construct and the
N-terminal extension present in the 'Clontech' construct, the expression vectors and GFP promoters are quite different.
The cause of the improved fluorescence signal is not enhanced expression level; it is improved protein performance.

Sexual PCR

[0493] The fluorescence signal of the synthetic 'wt' GFP construct was further improved by constructing a mutant
library by sexual PCR methods as described herein and in Stemmer WPC (1994) Proc. Natl. Acad. Sci. (U.S.A.) 91:
10747 and Stemmer WPC (1994) Nature 370: 389, incorporated herein by reference, followed by plating and selection
of the brightest colonies. After the second cycle of sexual PCR and selection, a mutant ('cycle 2') was obtained that
was about 8-fold improved over 'wt', and 23-fold over the 'Clontech' construct. After the third cycle a mutant ('cycle 3')
was obtained which was 16-18-fold improved over the 'wt' construct, and 45-fold over the 'Clontech' construct (Fig.
15B). The peak wavelengths of the excitation and emission spectra of the mutants were identical to those of the 'wt'
construct (Fig. 15B). SDS-PAGE analysis of whole cells showed that the total level of the GFP protein expressed in
all three constructs was unchanged, at a surprisingly high rate of about 75% of total protein (Fig. 16, panels (a) and
(b)). Fractionation of the cells by sonication and centrifugation showed that the 'wt' construct contained mostly inactive
GFP in the form of inclusion bodies, whereas the 'cycle 3' mutant GFP remained mostly soluble and was able to activate
its chromophore. The mutant genes were sequenced and the 'cycle 1' mutant was found to contain more mutations
than the 'cycle 3' mutant (Fig. 17). The 'cycle 3' mutant contained 3 protein mutations and 3 silent mutations relative to
the 'wt' construct. Mutations F100S, M154T and V164A involve the replacement of hydrophobic residues with more
hydrophilic residues (Kyte and Doolittle, 1982). One plausible explanation is that native GFP has a hydrophobic site on
its surface by which it normally binds to aequorin, or to another protein. In the absence of this other protein, the
hydrophobic site may cause aggregation and prevent autocatalytic activation of the chromophore. The three hydrophilic
mutations may counteract the hydrophobic site, resulting in reduced aggregation and increased chromophore activation.
Pulse chase experiments with whole bacteria at 37°C showed that the T1/2 for fluorophore formation was 95 minutes
for both the 'wt' and the 'cycle 3' mutant GFP.

CHO cells

[0494] Improvements in autonomous characteristics such as self-folding can be transferable to different cellular
environments. After being selected in bacteria, the 'cycle 3' mutant GFP was transferred into the eukaryotic Alpha+ vector
and expressed in Chinese hamster ovary cells (CHO). Whereas in E. coli the 'cycle 3' construct gave a 16-18-fold
stronger signal than the 'wt' construct, fluorescence spectroscopy of CHO cells expressing the 'cycle 3' mutant showed
a 42-fold greater whole cell fluorescence signal than the 'wt' construct under identical conditions (Fig. 18A). FACS
sorting confirmed that the average fluorescence signal of CHO cell clones expressing 'cycle 3' was 46-fold greater
than cells expressing the 'wt' construct (Fig. 18B). As for the 'wt' construct, the addition of 2 mM sodium butyrate was
found to increase the fluorescence signal about 4-8 fold.
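
The E. coli fold improvements reported above combine multiplicatively, which can be checked with a short calculation. The following is a minimal sketch, not part of the disclosed method; the function name combined_fold is hypothetical, and the input values are simply the figures quoted in the text (2.8-fold for codon usage, about 8-fold for 'cycle 2' and 16-18-fold for 'cycle 3' relative to 'wt').

```python
def combined_fold(*factors):
    """Multiply successive fold improvements into one overall factor."""
    total = 1.0
    for factor in factors:
        total *= factor
    return total


# Values quoted in the text (relative whole cell fluorescence in E. coli).
CODON_USAGE_FOLD = 2.8          # synthetic 'wt' construct vs 'Clontech' construct
CYCLE2_FOLD = 8.0               # 'cycle 2' mutant vs 'wt' (approximate)
CYCLE3_FOLD_RANGE = (16.0, 18.0)  # 'cycle 3' mutant vs 'wt'

cycle2_vs_clontech = combined_fold(CODON_USAGE_FOLD, CYCLE2_FOLD)
cycle3_low = combined_fold(CODON_USAGE_FOLD, CYCLE3_FOLD_RANGE[0])
cycle3_high = combined_fold(CODON_USAGE_FOLD, CYCLE3_FOLD_RANGE[1])

print(f"'cycle 2' vs 'Clontech': about {cycle2_vs_clontech:.1f}-fold")
print(f"'cycle 3' vs 'Clontech': about {cycle3_low:.1f}- to {cycle3_high:.1f}-fold")
```

The products (about 22-fold and about 45- to 50-fold) are in line with the rounded 23-fold and 45-fold values reported relative to the 'Clontech' construct.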

Screening versus selection 

[0495] These results were obtained by visual screening of approximately 10,000 colonies, and the brightest 40
colonies were picked at each cycle. Significant improvements in protein function can be obtained with relatively low
numbers of variants. In view of this surprising finding, sexual PCR can be combined with high throughput screening
procedures as an improved process for the optimization of the large number of commercially important enzymes for which
large scale mutant selections are not feasible or efficient.

Example 14. Shuffling to Generate Improved Peptide Display Libraries
Background

[0496] Once recombinants have been characterized from a phage display library, polysome display library, or the
like, it is often useful to construct and screen a second generation library that displays variants of the originally displayed
sequence(s). However, the number of combinations for polypeptides longer than seven residues is so great
that all permutations will not generally be present in the primary library. Furthermore, by mutating sequences, the
"sequence landscape" around the isolated sequence can be examined to find local optima.

[0497] There are several methods available to the experimenter for the purposes of mutagenesis. For example,
suitable methods include site-directed mutagenesis, cassette mutagenesis, and error-prone PCR.



Overview 



[0498] The disclosed method for generating mutations in vitro is known as DNA shuffling. In an embodiment of DNA
shuffling, genes are broken into small, random fragments with DNase I, and then reassembled in a PCR-like reaction,
but typically without any primers. The process of reassembling can be mutagenic in the absence of a proof-reading
polymerase, generating up to about 0.7% error rate. These mutations consist of both transitions and transversions, often
randomly distributed over the length of the reassembled segment.
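
To give a feel for what an error rate of up to about 0.7% means in practice, the sketch below estimates the expected number of point mutations in a reassembled gene and the chance that a clone carries none. It is illustrative only: the function names are hypothetical, the 1 kb gene length is an assumed example, and errors are assumed to occur independently at each position with the same probability.

```python
def expected_mutations(length_bp, error_rate):
    """Expected number of point mutations in a sequence of length_bp bases."""
    return length_bp * error_rate


def prob_unmutated(length_bp, error_rate):
    """Probability that no position is mutated, assuming independent errors."""
    return (1.0 - error_rate) ** length_bp


GENE_LENGTH_BP = 1000   # assumed example length (a 1 kb gene)
ERROR_RATE = 0.007      # upper bound quoted in the text (about 0.7%)

print(f"expected mutations per gene: {expected_mutations(GENE_LENGTH_BP, ERROR_RATE):.1f}")
print(f"fraction of unmutated genes: {prob_unmutated(GENE_LENGTH_BP, ERROR_RATE):.4f}")
```

Under these assumptions nearly every reassembled 1 kb clone would carry several point mutations, which is why a proof-reading polymerase is suggested when a lower mutation load is wanted.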

[0499] Once one has isolated a phage-displayed recombinant with desirable properties, it is generally appropriate
to improve or alter the binding properties through a round of molecular evolution via DNA shuffling. Second generation
libraries of displayed peptides and antibodies were generated and isolated phage with improved (i.e., 3-1000 fold)
apparent binding strength were produced. Thus, through repeated rounds of library generation and selection it is
possible to "hill-climb" through sequence space to optimal binding.
[0500] From second generation libraries, very often stronger binding species can be isolated. Selective enrichment
of such phage can be accomplished by screening with lower target concentrations immobilized on a microtiter plate
or in solution, combined with extensive washing or by other means known in the art. Another option is to display the
mutagenized population of molecules at a lower valency on phage to select for molecules with higher affinity constants.
Finally, it is possible to screen second generation libraries in the presence of a low concentration of binding inhibitor
(i.e., target, ligand) that blocks the efficient binding of the parental phage.

Methods 

Exemplary Mutagenesis Protocols

[0501] A form of recombinant DNA-based mutagenesis is known as oligonucleotide-mediated site-directed
mutagenesis. An oligonucleotide is designed such that it can base-pair to a target DNA, while differing in one or more bases
near the center of the oligonucleotide. When this oligonucleotide is base-paired to the single-stranded template DNA,
the heteroduplex is converted into double-stranded DNA in vitro; in this manner one strand of the product will carry the
nucleotide sequence specified by the mutagenic oligonucleotide. These DNA molecules are then propagated in vivo
and the desired recombinant is ultimately identified among the population of transformants.
[0502] A protocol for single-stranded mutagenesis is described below.

1. Prepare single-stranded DNA from M13 phage or phagemids. Isolate ~2 ng of DNA. The DNA can be isolated
from a dut-ung- bacterial host (source) so that the recovered DNA contains uracil in place of many thymine residues.

2. Design an oligonucleotide that has at least 15 or 20 residues of complementarity to the coding regions flanking
the site to be mutated. In the oligonucleotide, the region to be randomized can be represented by degenerate
codons. If the non-complementary region is large (i.e., > 12 nucleotides), then the flanking regions should be
extended to ensure proper base pairing. The oligonucleotide should be synthesized with a 5'-PO4 group as it
improves the efficiency of the mutagenesis procedure; this group can also be added enzymatically with T4
polynucleotide kinase. (In an Eppendorf tube, incubate 100 ng of oligonucleotide with 2 units of T4 polynucleotide
kinase in 50 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 5 mM DTT, and 0.4 mM ATP for 30 min.)

3. Anneal the oligonucleotide with the single-stranded DNA in a 500 µl Eppendorf tube containing:
mg single-stranded DNA, 10 ng oligonucleotide, 20 mM Tris-HCl (pH 7.4), 2 mM MgCl2, 50 mM NaCl.

4. Mix the solutions together and centrifuge the tube for a few seconds to recollect the liquid. Heat the tube in a
flask containing water heated to 70°C. After 5 min, transfer the flask to the lab bench and let it cool to room
temperature.

5. Take the tube out of the water bath and put it on ice. Add the following reagents to the tube, for a total volume
of 100 µl: 20 mM Tris-HCl (pH 7.4), 2 mM DTT, 0.5 mM dATP, dCTP, dGTP and dTTP, 0.4 mM ATP, 1 unit T7 DNA
polymerase, 2 units T4 DNA ligase.

6. After 1 hr, add EDTA to 10 mM final concentration.

7. Take 20 µl from the sample and run on an agarose gel. Most of the single-stranded DNA should be converted
to covalently-closed circular DNA. Electrophorese some controls in adjacent lanes (i.e., template, template reaction
without oligonucleotide). Add T4 DNA ligase to close the double-stranded circular DNA.

8. Extract the remainder of the DNA (80 µl) by phenol extraction and recover by ethanol precipitation.

9. Electroporate into ung+ bacteria.

10. Harvest the second generation phage by PEG precipitation.

Cassette mutagenesis

[0503] A convenient means of introducing mutations at a particular site within a coding region is by cassette
mutagenesis. The "cassette" can be generated several different ways: A) by annealing two oligonucleotides together and
converting them into double stranded DNA; B) by first amplifying segments of DNA with oligonucleotides that carry
randomized sequences and then reamplifying the DNA to create the cassette for cloning; C) by first amplifying each
half of the DNA segment with oligonucleotides that carry randomized sequences, and then heating the two pieces
together to create the cassette for cloning; and D) by error-prone PCR. The cassettes formed by these four procedures
are fixed in length and coding frame, but have codons which are unspecified at a low frequency. Thus, cloning and
expression of the cassettes will generate a plurality of peptides or proteins that have one or more mutant residues
along the entire length of the cassette.

[0504] Typically, two types of mutagenesis scheme can be used. First, certain residues in a phage-displayed protein
or peptide can be completely randomized. The codons at these positions can be NNN (all 64 codons), or NNK or NNS,
which use 32 codons to encode all 20 residues. They can also be synthesized as preformed triplets or by mixing
oligonucleotides synthesized by the split-resin method which together cover all 20 codons at each desired position.
Conversely, a subset of codons can be used to favor certain amino acids and exclude others. Second, all of the codons
in the cassette can have some low probability of being mutated. This is accomplished by synthesizing oligonucleotides
with bottles spiked with the other three bases or by altering the ratio of oligonucleotides mixed together by the
split-resin method.
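
The statement that NNK and NNS schemes use 32 codons to cover all 20 residues can be verified by enumeration. The sketch below is illustrative only: it assumes the standard genetic code and the standard IUPAC ambiguity codes (N = any base, K = G or T, S = G or C), and the helper names expand_scheme and summarize_scheme are hypothetical.

```python
from itertools import product

# Standard genetic code, written out in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

# IUPAC ambiguity codes used by the randomization schemes in the text.
IUPAC = {"N": "ACGT", "K": "GT", "S": "CG"}


def expand_scheme(scheme):
    """List every codon matched by a three-letter degenerate scheme such as 'NNK'."""
    return ["".join(bases) for bases in product(*(IUPAC[symbol] for symbol in scheme))]


def summarize_scheme(scheme):
    """Return (codon count, amino acids encoded, sorted list of stop codons)."""
    codons = expand_scheme(scheme)
    residues = {CODON_TABLE[codon] for codon in codons} - {"*"}
    stops = sorted(codon for codon in codons if CODON_TABLE[codon] == "*")
    return len(codons), len(residues), stops


for scheme in ("NNN", "NNK", "NNS"):
    n_codons, n_residues, stops = summarize_scheme(scheme)
    print(f"{scheme}: {n_codons} codons, {n_residues} amino acids, stop codons: {stops}")
```

Both restricted schemes retain a single stop codon (TAG) while halving the library size relative to NNN, which is the usual reason for preferring them.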
[0505] For mutagenesis of short regions, cassette mutagenesis with synthetic oligonucleotides is generally preferred.
More than one cassette can be used at a time to alter several regions simultaneously. This approach is preferred when
creating a library of mutant antibodies, where all six complementarity determining regions (CDR) are altered
concurrently.

Random codons



[0506]

1. Design oligonucleotides with both fixed and mutated positions. The fixed positions should correspond to the
cloning sites and those coding regions presumed to be essential for binding or function.
2. During synthesis of the oligonucleotide, have the oligonucleotide synthesizer deliver equimolar amounts of each
base for N, guanosine and thymidine for K, guanosine and cytosine for S.



'Spiked' codons 
[0507] 



1. Design oligonucleotides with both fixed and mutated positions. The fixed positions should correspond to the
cloning sites and those coding regions presumed to be essential for binding or function. The probability of finding
n errors in an m long polynucleotide cassette synthesized with x fraction of the other three nucleotides at each
position is represented by:

$$P = \frac{m!}{(m-n)!\,n!}\,x^{n}\,(1-x)^{m-n}$$



2. During synthesis of the oligonucleotide, switch out the base bottles. Use bottles with 100% of each base for the
fixed positions and bottles with (100-x)% of one base and (x/3)% of each of the other three bases. The doping ratio
can also differ based on the average amino acid use in natural globular proteins or other algorithms. There is a
commercially available computer program, CyberDope, which can be used to aid in determining the base mixtures
for synthesizing oligonucleotides with particular doping schemes. A demonstration copy of the CyberDope program
can be obtained by sending an email request to cyberdope@aol.com.
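
The binomial expression given for spiked cassettes can be evaluated directly when choosing a doping level. The sketch below is a minimal illustration; the function name spiked_error_probability and the example values (a 30-nucleotide cassette doped at x = 0.10) are assumptions made for the example and do not come from the text.

```python
from math import comb


def spiked_error_probability(m, n, x):
    """Probability of exactly n errors in an m-nucleotide cassette.

    x is the fraction of non-parental bases delivered at each position,
    so P = C(m, n) * x**n * (1 - x)**(m - n).
    """
    if not 0 <= n <= m:
        raise ValueError("n must be between 0 and m")
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be a fraction between 0 and 1")
    return comb(m, n) * x ** n * (1.0 - x) ** (m - n)


# Hypothetical example: a 30-nucleotide cassette doped at 10% per position.
CASSETTE_LENGTH = 30
DOPING_FRACTION = 0.10

for n_errors in range(6):
    p = spiked_error_probability(CASSETTE_LENGTH, n_errors, DOPING_FRACTION)
    print(f"P({n_errors} errors) = {p:.3f}")
```

Tabulating the distribution this way shows how the doping fraction trades off unmutated clones against heavily mutated ones for a given cassette length.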



Directed codons 
[0508]

1. Design oligonucleotides with both fixed and mutated positions. The fixed positions should correspond to the
cloning sites and those coding regions presumed to be essential for binding or function. One method has been
described for inserting a set of oligonucleotides at a specific restriction enzyme site that encodes all twenty amino
acids (Kegler-Ebo et al. (1994) Nucl. Acids Res. 22: 1593, incorporated herein by reference).
2. During synthesis of the oligonucleotide, split the resin at each codon.



Error-Prone PCR

[0509] There are several protocols based on altering standard PCR conditions (Saiki et al. (1988) Science 239: 487,
incorporated herein by reference) to elevate the level of mutation during amplification. Addition of elevated dNTP
concentrations and/or Mn2+ increases the rate of mutation significantly. Since the mutations are introduced at
random, this is one mechanism for generating populations of novel proteins. On the other hand, error-prone PCR is
not well suited for altering short peptide sequences, because the coding regions are short and the rate of change would
be too low to generate an adequate number of mutants for selection, nor is it ideal for long proteins, because there will
be many mutations within the coding region, which complicates analysis.

1. Design oligonucleotide primers that flank the coding region of interest in the phage. They are often
~21 nucleotides in length and flank the region to be mutagenized. The fragment to be amplified can carry restriction
sites within it to permit easy subcloning in the appropriate vector.

2. Combine the primers and the DNA template with 100 mM NaCl, 1 mM MnCl2, 1 mM DTT, 0.2 mM of
each dNTP, and 2 units of Taq DNA polymerase.

3. Cover the liquid with mineral oil.

4. Cycle 24 times between 30 sec at 94°C, 30 sec at 45°C, and 30 sec at 72°C to amplify fragments up to 1 kb. For
longer fragments, the 72°C step is lengthened by approximately 30 sec for each kb.

5. Extend the PCR reaction for 5-10 min at 72°C to increase the fraction of molecules that are full-length. This is
important if the fragment termini contain restriction sites that will be used in subcloning later.

6. The PCR reaction is optionally monitored by gel electrophoresis.

7. The PCR product is digested with the appropriate restriction enzyme(s) to generate sticky ends. The restriction
fragments can be gel purified.

8. The DNA segment is cloned into a suitable vector by ligation and introduced into host cells.

DNA Shuffling



[0510] In DNA shuffling, genes are broken into small, random fragments with a phosphodiester bond lytic agent,
such as DNase I, and then reassembled in a PCR-like reaction, but without requirement for any added primers. The
process of reassembling can be mutagenic in the absence of a proof-reading polymerase, generating up to
approximately 0.7% error when 10-50 bp fragments are used.

1. PCR amplify the fragment to be shuffled. Often it is convenient to PCR from a bacterial colony or plaque. Touch
the colony or plaque with a sterile toothpick and swirl in a PCR reaction mix (buffer, deoxynucleotides,
oligonucleotide primers). Remove the toothpick and heat the reaction for 10 min at [...]°C. Cool the reaction, add [...]
units of Taq DNA polymerase, and cycle the reaction 35 times for 30 sec at 94°C, 30 sec at 45°C, 30 sec at 72°C,
and finally heat the sample for 5 min at 72°C. (Given conditions are for a 1 kb gene and are modified according
to the length of the sequence as described.)

2. Remove the free primers. Complete primer removal is important.

3. Approximately 2-4 µg of the DNA is fragmented with 0.15 units of DNase I (Sigma, St. Louis, MO) in 100 µl of
50 mM Tris-HCl (pH 7.4), 1 mM MgCl2, for 5-10 min at room temperature. Freeze on dry ice, check size range of
fragments on 2% low melting point agarose gel or equivalent, and thaw to continue digestion until the desired size
range is obtained. The desired size range depends on the application; for shuffling of a 1 kb gene, fragments of 100-300
bases are normally adequate.

4. The desired DNA fragment size range is gel purified from a 2% low melting point agarose gel or equivalent. A
preferred method is to insert a small piece of Whatman DE-81 ion-exchange paper just in front of the DNA, run the
DNA into the paper, put the paper in 0.5 ml 1.2 M NaCl in TE, vortex 30 sec, then carefully spin out all the paper,
transfer the supernatant and add 2 volumes of 100% ethanol to precipitate the DNA; no cooling of the sample
should be necessary. The DNA pellet is then washed with 70% ethanol to remove traces of salt.

5. The DNA pellet is resuspended in PCR mix (Promega, Madison, WI) containing 0.2 mM each dNTP, 2.2 mM
MgCl2, 50 mM KCl, 10 mM Tris-HCl, pH 9.0, 0.1% Triton X-100, at a concentration of about 10-30 ng of fragments
per µl of PCR mix (typically 100-600 ng per 10-20 µl PCR reaction). Primers are not required to be added in this
PCR reaction. Taq DNA polymerase (Promega, Madison, WI) alone can be used if a substantial rate of mutagenesis
(up to 0.7% with 10-50 bp DNA fragments) is desired. The inclusion of a proof-reading polymerase, such as a 1:30
(vol/vol) mixture of Taq and Pfu DNA polymerase (Stratagene, San Diego, CA), is expected to yield a lower error
rate and allows the PCR of very long sequences. A program of 30-45 cycles of 30 sec 94°C, 30 sec 45-50°C, 30
sec 72°C, hold at 4°C, is used in an MJ Research PTC-150 minicycler (Cambridge, MA). The progress of the
assembly can be checked by gel analysis. The PCR product at this point contains the correct size product in a
smear of larger and smaller sizes.

6. The correctly reassembled product of this first PCR is amplified in a second PCR reaction which contains outside
primers. Aliquots of 7.5 µl of the PCR reassembly are diluted 40x with PCR mix containing 0.8 µM of each primer.
A PCR program of 20 cycles of 30 sec 94°C, 30 sec 50°C, and 30-45 sec at 72°C is run, with 5 min at 72°C at the
end.

7. The desired PCR product is then digested with terminal restriction enzymes, gel purified, and cloned back into
a vector, which is often introduced into a host cell.
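
The recombination outcome of the steps above can be pictured with a deliberately simplified in silico model. The sketch below is a toy illustration, not the disclosed protocol: it assumes pre-aligned parent sequences of equal length, copies each successive fragment-sized block from a randomly chosen parent, and ignores homology-dependent annealing, point mutagenesis and incomplete assembly. The function name toy_shuffle and its parameters are hypothetical.

```python
import random


def toy_shuffle(parents, min_frag=100, max_frag=300, rng=None):
    """Build one chimeric sequence from aligned, equal-length parent sequences.

    Each block of min_frag..max_frag bases is copied from a randomly chosen
    parent, mimicking template switching between homologous fragments.
    """
    if not parents:
        raise ValueError("at least one parent sequence is required")
    length = len(parents[0])
    if any(len(parent) != length for parent in parents):
        raise ValueError("parent sequences must be aligned and of equal length")
    rng = rng or random.Random()

    blocks = []
    position = 0
    while position < length:
        block_length = rng.randint(min_frag, max_frag)
        donor = rng.choice(parents)
        blocks.append(donor[position:position + block_length])
        position += block_length
    return "".join(blocks)


# Two artificial 1 kb "parents" that differ at every position, so the
# origin of each block in the chimera is visible at a glance.
parent_a = "A" * 1000
parent_c = "C" * 1000
library = [toy_shuffle([parent_a, parent_c], rng=random.Random(seed)) for seed in range(5)]
for chimera in library:
    print(chimera[::50])  # print every 50th base as a compact fingerprint
```

Using the 100-300 base block size mentioned for a 1 kb gene, each simulated chimera contains a handful of crossovers; a more realistic model would also require sequence identity at each junction.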

[0511] Site-specific recombination can also be used, for example, to shuffle heavy and light antibody chains inside
infected bacterial cells as a means of increasing the binding affinity and specificity of antibody molecules. It is possible
to use the Cre/lox system (Waterhouse et al. (1993) Nucl. Acids Res. 21: 2265; Griffiths et al. (1994) EMBO J. 13:
3245, incorporated by reference) and the int system.

[0512] It is possible to take recombinants and to shuffle them together to combine advantageous mutations that
occur on different DNA molecules, and it is also possible to take a recombinant displayed insert and to backcross with
parental sequences by DNA shuffling to remove any mutations that do not contribute to the desired traits.

Example 15. Shuffling to Generate Improved Arsen ate Detoxification Bacteria 

[0S131 Arsenic detoxtfication is important for goldmining of arsenopyrite containing gold ores and other uses, such 
as environmental remediation. Plasmid pGJI 03. containing an operon f f 

et al (1989) Bacteriol 171: 83. incorporated herein by reference), was obtained from Prof. Simon Silver (U. of Illinois, 
Ghicago, ID.T^^GTTontaining pJG103. containing the p1258 ars operon cloned intopUCl9. a MlC (m,nimunn 
inhibrtory concentratbn) of 4 ng/ml on LB amp plates. The whole 5.5 kb plasmid was fragmented w,lh DNAse 1 into 
fragments of 100-1000 bp. and reassembled by PGR using the Perkin Elmer XL-PGR reagents. After assembling, the 
plasmid was digested with the unique restriction enzyme BamHI. The full length monomer was purified from the agarose 
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ael llqated and electtoporated into E «>/,TG1 calls. The tranlormed cells were plates on a range ol sodium arsenate 
^cSions (2 4, 8 16 mM in round 1 ), and approx. 1000 colonies from the plates with the highest arsenate levels 
rrrpooledbyscapingthepUesThecells were grown in liquid in the presenceof the sameconcentrationofarsen^^ 
and pSd was prepared fr^ this culture. Round 2 and 3 were identical to round 1 . except that the celte were plated 
at hi^alnate levels. 8, 16. 32, 64 mM were used for round 2; and 32, 64, 128, 256 mM were used tor select^n 

[0514] The best mutants grew overnight at up to 128 mM arsenate (MIC=256).

Improved strains showed that TG1 (wildtype pGJ103) grew in liquid at up to 10 mM, whereas the shuffled TG1
(mutant pGJ103) grew at up to 150 mM arsenate concentration.
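
The plating schedule of this example can be written out as a short sketch. The helper name `ladder` is hypothetical; the starting concentrations (2, 8 and 32 mM), the four two-fold steps per round, and the 10 mM and 150 mM liquid-culture limits are taken from the text above.

```python
def ladder(start_mM, steps=4, factor=2):
    """Return `steps` concentrations (mM), each `factor` times the previous one."""
    return [start_mM * factor**i for i in range(steps)]

# Starting concentration of each selection round, as read from the text.
schedule = {round_no: ladder(start) for round_no, start in ((1, 2), (2, 8), (3, 32))}

for round_no, plates in schedule.items():
    print(f"round {round_no}: {plates} mM sodium arsenate")

# Liquid-culture comparison from the text: 10 mM (wildtype) vs 150 mM (shuffled).
liquid_fold = 150 / 10
print(f"liquid growth limit improved {liquid_fold:.0f}-fold")
```

Each round starts at the second-highest level of the previous round, so consecutive ladders overlap and improved clones can still be compared against the previous selection level.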

[0515] The PCR program for the assembly was 94°C 20 s, 50x (94°C 15 s, 50°C 1 min, 72°C 30 s + 2 s/cycle), using a circular [...]

[0516] [...] the process resulted in a 50-100-fold improvement in the resistance to arsenate conferred by
the shuffled arsenate resistance operon; bacteria containing the improved operon grew on medium containing up to [...]

[0517] Figure 18 shows enhancement of resistance to arsenate toxicity as a result of shuffling the pGJ103 plasmid
containing the arsenate detoxification pathway operon.

Example 16. Shuffling to Generate Improved Cadmium Detoxification Bacteria

[0518] Plasmid pYW333, containing an operon for mercury detoxification, is a 15.5 kb plasmid containing at least 8
genes encoding a pathway for mercury detoxification (Wang et al. (1989) Bacteriol. 171:83, incorporated herein by
reference), and was obtained from Prof. Simon Silver (Univ. Illinois, Chicago, IL). 400-1500 bp fragments were obtained
as described supra and assembled with the XL-PCR reagents. After direct electroporation of the assembled DNA into
E. coli TG1, the cells were plated on a range of levels of mercury chloride (Sigma) under a similar protocol as that
described for arsenate in Example 15. The initial MIC of mercury was 50-70 µM. Four cycles of whole plasmid shuffling
were performed and increased the detoxification, measured as bacterial resistance to mercury, from about 50-70 µM
to over 1000 µM, a 15-20 fold improvement.
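
The fold improvement quoted above follows directly from the initial and final MIC figures. The short calculation below is only a worked check; the function name `fold_range` is hypothetical, and because the final MIC is reported as "over 1000", the computed values are lower bounds. Both MICs are assumed to be in the same concentration units.

```python
def fold_range(initial_low, initial_high, final):
    """Return (min_fold, max_fold) for a final MIC relative to an initial MIC range."""
    return final / initial_high, final / initial_low

# Initial MIC 50-70 and final MIC of at least 1000, as stated in the text.
low, high = fold_range(50, 70, 1000)
print(f"at least {low:.1f}- to {high:.1f}-fold")
```

The result of roughly 14- to 20-fold agrees with the stated 15-20 fold improvement.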

Example 17. Enhancement of Shuffling Reactions by Addition of Cationic Detergent

[0519] The rate of renaturation of complementary DNA strands becomes limiting for the shuffling of long, complex
sequences. This renaturation rate can be enhanced 10,000-fold by addition of simple cationic detergent (Pontius &
Berg (1991) PNAS 88:8237). The renaturation is specific and independent of up to a 10^[...]-fold excess of heterologous
DNA. In the presence of these agents the rate at which the complementary DNA strands encounter each other in solution [...]

[0520] Addition of TMAC in an assembly reaction of a 15 kb plasmid followed by electroporation into E. coli resulted
in the following results:



TMAC (mM)    # Colonies
0            3
15           88
30           301
60           15
90           3



[0521] Addition of CTAB in an assembly reaction of a 15 kb plasmid followed by electroporation into E. coli resulted
in the following results:



CTAB (mM)    # Colonies
0            3
30           34
100          14
300          0
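
The two detergent titrations can be compared with a few lines of code. This is a sketch only: the colony counts are transcribed from the TMAC and CTAB tables of this example, pairing each concentration with the count that follows it, and the helper name `best_dose` is hypothetical.

```python
# Colony counts keyed by detergent concentration (mM), transcribed from the tables.
TMAC = {0: 3, 15: 88, 30: 301, 60: 15, 90: 3}
CTAB = {0: 3, 30: 34, 100: 14, 300: 0}

def best_dose(table):
    """Return (concentration_mM, colonies, fold_over_no_detergent) for the best dose."""
    conc = max(table, key=table.get)
    return conc, table[conc], table[conc] / table[0]

for name, table in (("TMAC", TMAC), ("CTAB", CTAB)):
    conc, colonies, fold = best_dose(table)
    print(f"{name}: best at {conc} mM, {colonies} colonies, about {fold:.0f}-fold over no detergent")
```

In both titrations the colony yield peaks at an intermediate concentration and falls again at higher concentrations, so the optimum would need to be determined empirically for each detergent.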
Example 18. Sequence Shuffling via PCR Stuttering

[...] by stuttering was between 0.1-10%, depending on conditions.

[...] on selective media. The minimum inhibitory activity of these four constructs for a wide variety of betalactam [...]

[...] DNA polymerase, such as DNA pol I Klenow fragment. The initial PCR program [...]

with fragments in Klenow buffer at 10-50 ng/µl and dNTPs:



1- Denature 94°C 20s
2- Quick-cool: dry ice ethanol 5s, ice 15s
3- Add Klenow enzyme
4- Anneal/extend 2 min 25°C
5- Cycle back to denature (cycle 1)



This is repeated for 10-20 cycles to initiate the template switching, after which regular PCR with heat-stable polymerases
is continued for an additional 10-20 cycles to amplify the amount of product.
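
The manual cycle above can be captured as a small data structure, for example to estimate the timed portion of the stuttering phase. This is a hypothetical sketch: the `Step` class and `timed_seconds` helper are illustrative names, the 94°C denaturation temperature is as read from the protocol listing, and handling time for adding enzyme is not stated in the text and is counted as zero.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Step:
    name: str
    seconds: int
    temp_c: Optional[float] = None  # None where the text gives no temperature

# One stuttering cycle, transcribed from the numbered steps.
STUTTER_CYCLE = [
    Step("denature", 20, 94.0),
    Step("quick-cool on dry ice/ethanol", 5),
    Step("quick-cool on ice", 15),
    Step("add Klenow enzyme", 0),  # manual step; no duration is stated
    Step("anneal/extend", 120, 25.0),
]

def timed_seconds(cycle, n_cycles):
    """Total of the stated step durations over n_cycles repetitions."""
    return n_cycles * sum(step.seconds for step in cycle)

# The text calls for 10-20 cycles before switching to regular PCR.
for n in (10, 20):
    print(f"{n} cycles: {timed_seconds(STUTTER_CYCLE, n)} s of timed steps")
```

Klenow fragment is not heat-stable, which is consistent with the protocol adding fresh enzyme after each denaturation step rather than once at the start.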
Example 19. Shuffling of Antibody Phage Display Libraries 

[0527] A stable and well-expressed human single-chain Fv framework (VH251-VLA25) was [...]

[...] sequences. The degree of mutagenesis of each residue was similar to its naturally occurring variability within its [...]

[0528] A PCR product containing the scFv gene was randomly fragmented with DNase I digestion and fragments of
[...] bp were prepared. Synthetic oligonucleotides, each containing a [...]

the scFv template, were added to the random fragments at a 10:1 molar ratio. A library of full length, [...]
genes was reassembled from the fragments by sexual PCR. Cloning into the pIII protein of M13 phage yielded an Ab-
phage library of 4 x 10^[...] plaque-forming units. The combinations of mutant and native CDRs were characterized by
colony PCR with primers specific for the native CDRs (see Fig. 7). [...]

[...] efficiency and a wide variety of combinations. Sequencing of the mutated CDRs showed that the observed mutation [...]

[0529] [...]

[...] and seven of these targets yielded ELISA-positive clones. One target which resulted in positive clones was the human
G-CSF receptor. The G-CSF receptor positive clones were subjected to a second round and one quarter of second
round phage clones were ELISA-positive for binding to the G-CSF receptor.

[0530] This diverse pool was used to evaluate the suitability of three different sequence optimization strategies
(conventional PCR, error-prone PCR, and DNA shuffling). After a single cycle of each of these [...]
showed a sevenfold advantage, both in the percentage of Ab-phage recovered and in the G-CSF receptor specific
ELISA signal. The panning was continued for six additional cycles, shuffling the pool of scFv genes after each round.
The stringency of selection was gradually increased to two one-hour washes at 50°C in PBS-Tween in the presence [...]

[...] from different cycles were assayed at identical stringency, the percentage of phage [...]

[...] cycle 2 to cycle 8, as shown in Fig. 31. Individual phage clones from each round showed a similar increase in specific
ELISA signal. Sequencing showed that the scFv contained an average of 34 (n=4) amino acid mutations, of which only
four were present in all sequences evaluated.

[0531] In order to reduce potential immunogenicity, neutral or weakly contributing mutations were removed by two
cycles of backcrossing, with a 40-fold excess of a synthetically constructed germline scFv gene, followed by stringent
panning. The average number of amino acid mutations in the backcrossed scFvs was nearly halved to 8 (n=3), of
which only four were present in all sequences. The backcrossed Ab phage clones were shown to bind strongly and
with excellent specificity to the human G-CSF receptor. Fig. 32 shows the effect of ten selection rounds for several
human protein targets; six rounds of shuffling and two rounds of backcrossing were conducted. Fig. 33 shows the
relative recovery rates of phage, by panning with BSA, Ab 179, or G-CSF receptor, after conventional PCR ("non-
shuffled"), error-prone PCR, or recursive sequence recombination ("shuffled").

Example 20. Optimization of GFP in Mammalian Cells

[0532] The plasmid vector pCMV-GFP, which encodes GFP and expresses it under the control of a CMV promoter,
was grown in TG1 cells and used to transfect CHO cells for transient expression assays.

[0533] Plasmid was rescued from FACS selected transiently expressing TG1 cells by a proteinase K method or a
PreTaq method (BRL). Basically, the FACS collected cells were pelleted by centrifugation, [...]
incubated with either proteinase K or PreTaq, phenol/chloroform extracted, ethanol precipitated, and used to
transform E. coli, which were then plated on Amp plates. The results were:



Proteinase K method | input: 5 x 10^[...] | rescued: 3 x 10^[...]
PreTaq method       | input: 5 x 10^[...] | rescued: 2 x 10^[...]



[0534] The rescued plasmid was grown up and 5 ng was partially digested with DNase I, and 50 to 700 bp fragments
were [...] precipitated, and resuspended in 330 µl of 3.[...]

193 µl 40% PEG, 80 µl 10 mM dNTPs, 20 µl Tth polymerase, 2 µl Pfu polymerase, 7 µl TMAC (Sigma), and 367 µl
H2O. PCR was conducted on a MJ Research PTC-150 minicycler for 40 cycles (94°C, 30 sec; 50°C, 30 sec; 72°C, 60
sec) with [...] sets of primers, which yielded [...]

and [...] constituted the entire plasmid. The PCR fragments [...]

gel purified, ligated, and electroporated into TG1 cells. Plasmid DNA was prepared and electroporated into CHO cells,
which were screened by FACS for the cells transiently expressing the brightest GFP signals.
[0535] While the present invention has been described with reference to what are considered to be the preferred
examples, it is to be understood that the invention is not limited to the disclosed examples. To the contrary, the invention
is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the
appended claims.
Claims

1 . A method of shuffling polynucleotides comprising: 

conducting a polynucleotide amplification process on overlapping segments of a population of variants of a
polynucleotide encoding a plurality of genes under conditions whereby one segment serves as a template for
extension of another segment to generate a population of recombinant polynucleotides at least one of which
encodes the plurality of genes; and

screening or selecting a recombinant polynucleotide encoding the plurality of genes for a desired property
conferred by the genes or their expression products.

2. A method as claimed in claim 1 wherein the multiple genes are physically clustered in nature. 

3. A method as claimed in claim 1 wherein the multiple genes form a multicomponent pathway.

4. A method as claimed in claim 3 wherein the expression products of the genes in the multicomponent pathway 
produce a secondary metabolite. 

5. A method as claimed in claim 4 wherein the secondary metabolite is a drug and the screening or selecting comprises
spotting samples of the secondary metabolite on a lawn of cells and identifying a sample that results in clearing
of the lawn, indicating the sample is toxic to the cells.

6. A method as claimed in claim 4 wherein the secondary metabolite is an antibiotic. 

7. A method as claimed in claim 3 wherein the multiple gene pathway confers heavy metal resistance. 

8. A method as claimed in claim 3 wherein the multiple gene pathway confers arsenate resistance and the desired 
property is improved arsenate resistance. 

9. A method as claimed in claim 3 wherein the multiple gene pathway encodes at least eight genes conferring cad- 
mium resistance and the desired property is mercury resistance. 

10. A method as claimed in any one of claims 1 to 9 wherein the variants comprise natural variants. 

11 . A method as claimed in any one of claims 1 to 9 wherein the variants comprise induced variants. 

12. A method as claimed in claim 10 wherein the naturally-occurring variants of a polynucleotide comprise human 
polynucleotides. 

13. A method as claimed in claim 10 wherein the naturally-occurring variants of a polynucleotide comprise bacterial 
polynucleotides. 

14. A method as claimed in claim 10 wherein the naturally-occurring variants of a polynucleotide comprise plant
polynucleotides.

15. A method as claimed in claim 10 wherein the naturally-occurring variants of a polynucleotide comprise animal
polynucleotides.

16. A method as claimed in claim 10 wherein the naturally-occurring variants comprise allelic variants of the plurality
of genes.

17. A method as claimed in claim 10 wherein the naturally-occurring variants comprise species variants of the plurality
of genes.

18. A method as claimed in any one of claims 1 to 17 further comprising randomly fragmenting the population of
variants of a polynucleotide before conducting the amplification process.

19. A method as claimed in claim 18 wherein the population of variants of the polynucleotides are randomly fragmented
by DNase I digestion.



20. A method as claimed in any one of claims 1 to 19 wherein at least one cycle of the amplification process is conducted
under conditions resulting in incomplete extension of the variants of the polynucleotide.
FIG. 5B.